Microsoft Sentinel provides native connectors for many popular services. However, when your environment includes custom-built applications, legacy systems, or niche appliances, you’ll often need to step outside of the out-of-the-box options. Ingesting uncommon log sources is essential for achieving complete visibility across your environment. Fortunately, with the right approach, it’s absolutely achievable.
In this post, I’ll walk you through the core strategies for ingestion, share real-world client scenarios, and provide code examples to get you started. Whether you’re handling obscure IoT devices or integrating custom log formats, Microsoft Sentinel can help you bring it all together.
🔍 1. Understand the Log Source First
Before doing anything else, take the time to thoroughly understand your log source. Ask yourself:
- What format is the data in? (e.g., JSON, CSV, syslog, raw text)
- Where is it stored or how is it transmitted? (Local file, API, blob storage, streaming?)
- How frequently is it generated? (Real-time, hourly, daily?)
- Can it be transformed or normalized before it hits Sentinel?
Taking this discovery step upfront will significantly reduce friction later on in the integration process.
⚙️ 2. Choose the Right Ingestion Method (with Client Examples)
Once you understand the source, the next step is to choose the appropriate ingestion method. There are several options depending on the data’s format, source, and transport method. Below are common methods with real-world examples.
✅ Azure Monitor Agent (AMA)
Pharmaceutical Client Example:
A legacy Windows-based app wrote event logs locally. To ingest these, we installed the Azure Monitor Agent (AMA), configured a custom Data Collection Rule (DCR), and filtered for key event IDs, such as failed login attempts and privilege changes. As a result, only the most relevant security events were collected.
✅ Syslog/CEF via Linux Forwarder
Communications Client Example:
On-premises Fortinet firewalls lacked a native Sentinel connector. Therefore, we deployed a Linux syslog collector with the AMA agent installed. We then configured the firewall to send logs in CEF format.
Syslog Config Sample:
# /etc/rsyslog.d/fortinet.conf
$template FortiFormat,"%msg%\n"
:programname, isequal, "FortiGate" /var/log/fortinet/fortigate.log;FortiFormat
This approach ensured compatibility with Sentinel’s built-in parsing logic for CEF.
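To verify the pipeline end to end, it helps to confirm that firewall events are actually landing in the CommonSecurityLog table. A minimal check, assuming the device reports its vendor as "Fortinet" (the exact string can vary by model and firmware):
KQL Validation Sample:
CommonSecurityLog
| where DeviceVendor == "Fortinet"
// Hourly counts per product confirm the forwarder and CEF parsing are healthy
| summarize Events = count() by DeviceProduct, bin(TimeGenerated, 1h)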
✅ Azure Function + HTTP Data Collector API
Retail Client Example:
A third-party POS system provided log access via API. In this case, we created an Azure Function that polls the endpoint, formats the data, and forwards it to Sentinel using the HTTP Data Collector API.
Azure Function Sample (Python):
import requests, json, datetime, hashlib, hmac, base64

# Workspace details for the HTTP Data Collector API
customer_id = 'workspace-id'      # Log Analytics workspace ID
shared_key = 'your-primary-key'   # Workspace primary key
log_type = 'POS_Transactions'     # Custom log type (table becomes POS_Transactions_CL)
resource = '/api/logs'

# Pull the latest logs from the POS API and serialize them as a JSON payload
rfc1123date = datetime.datetime.utcnow().strftime('%a, %d %b %Y %H:%M:%S GMT')
body = json.dumps(requests.get('https://api.thirdpartypos.com/logs').json())
content_length = len(body)

# Build the SharedKey signature required by the Data Collector API
string_to_hash = f"POST\n{content_length}\napplication/json\n{rfc1123date}\n{resource}"
signed = base64.b64encode(hmac.new(
    base64.b64decode(shared_key),
    string_to_hash.encode(),
    hashlib.sha256).digest()).decode()

headers = {
    'Content-Type': 'application/json',
    'Authorization': f'SharedKey {customer_id}:{signed}',
    'Log-Type': log_type,
    'x-ms-date': rfc1123date
}

# Post the payload to the workspace ingestion endpoint
uri = f'https://{customer_id}.ods.opinsights.azure.com{resource}?api-version=2016-04-01'
requests.post(uri, data=body, headers=headers)
By automating this process, we eliminated manual log pulls and ensured near real-time data availability in Sentinel.
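The Data Collector API writes to a custom table named after the Log-Type header with a _CL suffix, so in this example the data appears in POS_Transactions_CL. A simple freshness check makes it easy to catch a stalled Function before analysts notice missing data:
KQL Freshness Check:
POS_Transactions_CL
| summarize Records = count(), LastSeen = max(TimeGenerated)
// How long since the last record arrived
| extend MinutesSinceLastRecord = datetime_diff('minute', now(), LastSeen)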
✅ Blob Storage or Event Hub
Manufacturing Client Example:
In another case, an IoT gateway exported logs as CSV to Azure Blob Storage every hour. To ingest them, we used a Logic App that parsed each file and wrote the normalized data to a custom Sentinel table.
This method worked particularly well for handling bursty data flows from edge devices.
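Because the files arrive hourly, a simple cadence check makes gaps easy to spot. The table name below (IoTGateway_CL) is a placeholder for whatever custom table your Logic App writes to:
KQL Cadence Check:
IoTGateway_CL
// A flat-lined hour usually means a missed blob or a failed Logic App run
| summarize Rows = count() by bin(TimeGenerated, 1h)
| render columnchart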
🧹 3. Normalize and Parse Logs
After ingestion, the next critical step is log normalization. Raw logs—especially from homegrown applications—often lack structure. Fortunately, Kusto Query Language (KQL) gives you the tools to extract and reshape data for analytics and detection.
University Client Example:
One in-house app logged lines like this:
[2025-04-02 08:13:22] user=jdoe action=login status=failed ip=10.10.10.15
To parse it into usable fields, we used the following KQL:
CustomApp_CL
| extend Timestamp = todatetime(extract(@"\[(.*?)\]", 1, RawData))
| extend User = extract(@"user=(\S+)", 1, RawData)
| extend Action = extract(@"action=(\S+)", 1, RawData)
| extend Status = extract(@"status=(\S+)", 1, RawData)
| extend IPAddress = extract(@"ip=(\S+)", 1, RawData)
| project Timestamp, User, Action, Status, IPAddress
As a result, SOC analysts could now use filters, alerts, and visualizations that made sense.
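For example, once the fields are extracted, hunting for repeated failed logins becomes a short pipeline. The threshold of five attempts per hour below is just an illustrative starting point:
KQL Hunting Sample:
CustomApp_CL
| extend User = extract(@"user=(\S+)", 1, RawData)
| extend Status = extract(@"status=(\S+)", 1, RawData)
| extend IPAddress = extract(@"ip=(\S+)", 1, RawData)
| where Status == "failed"
| summarize FailedAttempts = count() by User, IPAddress, bin(TimeGenerated, 1h)
| where FailedAttempts > 5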
⚠️ 4. Build Analytics Rules
Once logs are structured, you can start writing detection logic. Analytics rules allow you to catch suspicious patterns and trigger alerts.
Healthcare Client Example:
We ingested VPN logs that included geolocation metadata. To detect “country hopping” (a common sign of account compromise), we built this rule:
VPNLogs_CL
| extend Country = extract("country\":\"([^\"]+)", 1, RawData)
| summarize Count = count(), Countries = make_set(Country) by User, bin(TimeGenerated, 1h)
| where array_length(Countries) > 1
This helped the SOC proactively investigate suspicious login activity.
📊 5. Create Workbooks for Visibility
Visualizations are essential for giving your SOC quick context.
For example, here’s a simple chart that shows VPN login activity by country:
VPNLogs_CL
| extend Country = extract("country\":\"([^\"]+)", 1, RawData)
| summarize Logins = count() by bin(TimeGenerated, 1h), Country
| render timechart
This type of workbook helped clients quickly spot trends and anomalies across user behavior.
⚠️ Common Challenges to Expect
Even with the right tools, you may run into a few obstacles:
- Inconsistent timestamps: normalize them with todatetime() in KQL (see the sketch after this list) or handle them in your ingestion script
- High ingestion volume: monitor data caps in Log Analytics and optimize your Data Collection Rules (DCRs)
- Trial-and-error parsing: expect to tweak your KQL as log formats evolve
- Silent failures: always validate ingestion with sample data before full deployment
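For the timestamp problem specifically, a pattern that works well is to parse the embedded time and fall back to ingestion time when parsing fails. Here is a minimal sketch using the CustomApp_CL sample from section 3:
KQL Timestamp Normalization Sketch:
CustomApp_CL
| extend ParsedTime = todatetime(extract(@"\[(.*?)\]", 1, RawData))
// Fall back to TimeGenerated when the embedded timestamp cannot be parsed
| extend EventTime = coalesce(ParsedTime, TimeGenerated)
| project-away ParsedTime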
Planning ahead and iterating frequently will save you a lot of troubleshooting time.
✅ Final Thoughts
Uncommon log sources can be intimidating, but they often hold critical insights for security teams. By using a mix of Azure Functions, Logic Apps, Syslog Forwarders, and KQL parsing, you can bring nearly any data source into Sentinel and turn it into actionable intelligence.
Whether your goal is detection, compliance, or full visibility, ingesting these custom logs helps close blind spots and strengthens your security posture.