
Demystifying the Ingestion of Uncommon Log Sources into Microsoft Sentinel: What You Need to Know (with Real Client Examples + Code)


Microsoft Sentinel provides native connectors for many popular services. However, when your environment includes custom-built applications, legacy systems, or niche appliances, you’ll often need to step outside of the out-of-the-box options. Ingesting uncommon log sources is essential for achieving complete visibility across your environment. Fortunately, with the right approach, it’s absolutely achievable.

In this post, I’ll walk you through the core strategies for ingestion, share real-world client scenarios, and provide code examples to get you started. Whether you’re handling obscure IoT devices or integrating custom log formats, Microsoft Sentinel can help you bring it all together.


🔍 1. Understand the Log Source First

Before doing anything else, take the time to thoroughly understand your log source. Ask yourself:

  • What format is the data in? (e.g., JSON, CSV, syslog, raw text)
  • Where is it stored or how is it transmitted? (Local file, API, blob storage, streaming?)
  • How frequently is it generated? (Real-time, hourly, daily?)
  • Can it be transformed or normalized before it hits Sentinel?

Taking this discovery step upfront will significantly reduce friction later on in the integration process.


⚙️ 2. Choose the Right Ingestion Method (w/ Client Examples)

Once you understand the source, the next step is to choose the appropriate ingestion method. There are several options depending on the data’s format, source, and transport method. Below are common methods with real-world examples.

✅ Azure Monitor Agent (AMA)

Pharmaceutical Client Example:
A legacy Windows-based app wrote event logs locally. To ingest these, we installed the Azure Monitor Agent (AMA), configured a custom Data Collection Rule (DCR), and filtered for key event IDs, such as failed login attempts and privilege changes. As a result, only the most relevant security events were collected.


✅ Syslog/CEF via Linux Forwarder

Communications Client Example:
On-premises Fortinet firewalls lacked a native Sentinel connector. Therefore, we deployed a Linux syslog collector running the Azure Monitor Agent and configured the firewall to send logs in CEF format.

Syslog Config Sample:

# /etc/rsyslog.d/fortinet.conf
$template FortiFormat,"%msg%\n"
:programname, isequal, "FortiGate" /var/log/fortinet/fortigate.log;FortiFormat

This approach ensured compatibility with Sentinel’s built-in parsing logic for CEF.
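Once the forwarder was in place, a quick KQL check confirmed events were landing in the CommonSecurityLog table (the DeviceVendor value below matches what we observed from FortiGate; yours may differ by firmware):

Validation Query (KQL):

CommonSecurityLog
| where DeviceVendor == "Fortinet"
| summarize Events = count() by DeviceProduct, bin(TimeGenerated, 1h)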


✅ Azure Function + HTTP Data Collector API

Retail Client Example:
A third-party POS system provided log access via API. In this case, we created an Azure Function that polls the endpoint, formats the data, and forwards it to Sentinel using the HTTP Data Collector API.

Azure Function Sample (Python):

import requests, json, datetime, hashlib, hmac, base64

customer_id = 'workspace-id'      # Log Analytics workspace ID
shared_key = 'your-primary-key'   # workspace primary key
log_type = 'POS_Transactions'     # lands in the POS_Transactions_CL custom table
resource = '/api/logs'
rfc1123date = datetime.datetime.utcnow().strftime('%a, %d %b %Y %H:%M:%S GMT')

# Pull the latest records from the POS API and serialize them as JSON
body = json.dumps(requests.get('https://api.thirdpartypos.com/logs').json())
content_length = len(body)

# Build the SharedKey signature; the "x-ms-date:" prefix is required
string_to_hash = f"POST\n{content_length}\napplication/json\nx-ms-date:{rfc1123date}\n{resource}"
signed = base64.b64encode(hmac.new(
    base64.b64decode(shared_key),
    string_to_hash.encode(),
    hashlib.sha256).digest()).decode()

headers = {
    'Content-Type': 'application/json',
    'Authorization': f'SharedKey {customer_id}:{signed}',
    'Log-Type': log_type,
    'x-ms-date': rfc1123date
}

uri = f'https://{customer_id}.ods.opinsights.azure.com{resource}?api-version=2016-04-01'
requests.post(uri, data=body, headers=headers)

By automating this process, we eliminated manual log pulls and ensured near real-time data availability in Sentinel.
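To run the poll on a schedule rather than on demand, the same logic can sit inside a timer-triggered function. Here's a minimal sketch using the Azure Functions Python v2 programming model; the five-minute cron schedule and the send_pos_logs_to_sentinel() wrapper are illustrative, not the client's exact code:

Timer Trigger Sketch (Python):

import azure.functions as func

app = func.FunctionApp()

@app.timer_trigger(schedule="0 */5 * * * *", arg_name="timer")
def poll_pos_logs(timer: func.TimerRequest) -> None:
    # Reuse the signing-and-POST logic shown above to pull the
    # latest POS records and forward them to the workspace.
    send_pos_logs_to_sentinel()  # hypothetical wrapper around the code above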


✅ Blob Storage or Event Hub

Manufacturing Client Example:
In another case, an IoT gateway exported logs as CSV to Azure Blob Storage every hour. To ingest them, we used a Logic App that parsed each file and wrote the normalized data to a custom Sentinel table.

This method worked particularly well for handling bursty data flows from edge devices.
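A Logic App isn't the only route, either. If you land the raw CSV rows first and normalize in the workspace instead, KQL's parse_csv() can split each line at query time. A sketch, assuming a hypothetical IoTGateway_CL table whose RawData column holds one CSV row per record:

KQL Sketch:

IoTGateway_CL
| extend Fields = parse_csv(RawData)
| project Timestamp = todatetime(tostring(Fields[0])),
          DeviceId = tostring(Fields[1]),
          Metric = tostring(Fields[2]),
          Value = todouble(Fields[3])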


🧹 3. Normalize and Parse Logs

After ingestion, the next critical step is log normalization. Raw logs—especially from homegrown applications—often lack structure. Fortunately, Kusto Query Language (KQL) gives you the tools to extract and reshape data for analytics and detection.

University Client Example:
One in-house app logged lines like this:

[2025-04-02 08:13:22] user=jdoe action=login status=failed ip=10.10.10.15

To parse it into usable fields, we used the following KQL:

CustomApp_CL
| extend Timestamp = todatetime(extract(@"\[(.*?)\]", 1, RawData))
| extend User = extract(@"user=(\S+)", 1, RawData)
| extend Action = extract(@"action=(\S+)", 1, RawData)
| extend Status = extract(@"status=(\S+)", 1, RawData)
| extend IPAddress = extract(@"ip=(\S+)", 1, RawData)
| project Timestamp, User, Action, Status, IPAddress

As a result, SOC analysts could now use filters, alerts, and visualizations that made sense.
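For instance, a simple hunting query for repeated failed logins per user became straightforward (shown here with the extracts inlined; a saved function keeps it tidier, and the threshold of 5 is illustrative):

CustomApp_CL
| extend User = extract(@"user=(\S+)", 1, RawData),
         Status = extract(@"status=(\S+)", 1, RawData)
| where Status == "failed"
| summarize Failures = count() by User, bin(TimeGenerated, 15m)
| where Failures > 5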


⚠️ 4. Build Analytics Rules

Once logs are structured, you can start writing detection logic. Analytics rules allow you to catch suspicious patterns and trigger alerts.

Healthcare Client Example:
We ingested VPN logs that included geolocation metadata. To detect “country hopping” (a common sign of account compromise), we built this rule:

VPNLogs_CL
| extend Country = extract("country\":\"([^\"]+)", 1, RawData)
| summarize Count = count(), Countries = make_set(Country) by User, bin(TimeGenerated, 1h)
| where array_length(Countries) > 1

This helped the SOC proactively investigate suspicious login activity.


📊 5. Create Workbooks for Visibility

Visualizations are essential for giving your SOC quick context.

For example, here’s a simple chart that shows VPN login activity by country:

VPNLogs_CL
| extend Country = extract("country\":\"([^\"]+)", 1, RawData)
| summarize Logins = count() by bin(TimeGenerated, 1h), Country
| render timechart

This type of workbook helped clients quickly spot trends and anomalies across user behavior.


⚠️ Common Challenges to Expect

Even with the right tools, you may run into a few obstacles:

  • Inconsistent timestamps — Normalize with KQL's todatetime() (see the sketch after this list) or handle them in your ingestion script
  • High ingestion volume — Monitor data caps in Log Analytics, and optimize Data Collection Rules (DCRs)
  • Trial-and-error parsing — Expect to tweak your KQL as log formats evolve
  • Silent failures — Always validate ingestion with sample data before full deployment
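Here's a minimal sketch of that timestamp fix, assuming a hypothetical SomeSource_CL table with a string RawTime column arriving in mixed formats:

Timestamp Normalization (KQL):

SomeSource_CL
| extend EventTime = coalesce(
    todatetime(RawTime),                             // ISO 8601 and similar formats
    todatetime(replace_string(RawTime, "/", "-"))    // e.g. 2025/04/02 08:13:22
)
| where isnotempty(EventTime)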

Planning ahead and iterating frequently will save you a lot of troubleshooting time.


✅ Final Thoughts

Uncommon log sources can be intimidating, but they often hold critical insights for security teams. By using a mix of Azure Functions, Logic Apps, Syslog Forwarders, and KQL parsing, you can bring nearly any data source into Sentinel and turn it into actionable intelligence.

Whether your goal is detection, compliance, or full visibility, ingesting these custom logs helps close blind spots and strengthens your security posture.
