Skip to content

fix(rules): improve precision of 4 high-FP dotnet opengrep rules#63

Open
David Larsen (dc-larsen) wants to merge 3 commits intomainfrom
fix/dotnet-sast-rule-precision
Open

fix(rules): improve precision of 4 high-FP dotnet opengrep rules#63
David Larsen (dc-larsen) wants to merge 3 commits intomainfrom
fix/dotnet-sast-rule-precision

Conversation

@dc-larsen
Copy link
Copy Markdown
Contributor

@dc-larsen David Larsen (dc-larsen) commented Apr 11, 2026

Summary

Fixes 4 dotnet opengrep rules that produced 150 of 170 total false positives (88%) in a customer SAST evaluation, inflating the reported FP rate to 91%.

  • dotnet-xss-response-write: Converted to taint mode. Was matching any .Write() including Serilog ITextFormatter log sinks (74 FPs). Now tracks data flow from user input to Response.Write.
  • dotnet-hardcoded-credentials: Added value inspection and credential API patterns. Was matching variable names alone, flagging config paths like UseCaptchaOnResetPassword (31 FPs).
  • dotnet-crypto-failures: Rewrote to target weak algorithms (3DES/DES/RC2/RijndaelManaged). Was flagging Encoding.UTF8.GetBytes() which triggers on the recommended SHA256.HashData() pattern (30 FPs).
  • dotnet-path-traversal: Converted to taint mode. Was matching all Path.Combine() calls including framework paths like _env.WebRootPath (15 FPs).

Benchmark data (NIST Juliet C# Test Suite)

Rule Before Precision After Precision Before Recall After Recall
xss-response-write 41.6% 100% 47.8% 24.3%
hardcoded-credentials 0% 100% 0% 3.6%
crypto-failures 36.7% 100% 51.4% 50.0%
path-traversal 0% 100% 0% 45.2%

All 4 rules achieve 100% precision (zero false positives) post-fix. Recall trade-offs are acceptable: taint-mode rules only fire when user input actually reaches the sink, which is the correct behavior for security analysis.

Customer impact

Eliminates all 150 FPs from these 4 rules. Remaining findings (36 total, 20 FP) produce a ~56% FP rate, consistent with pattern-matching SAST tools. Further tuning via community rules and per-language scoping can reduce this further.

Testing

  • opengrep --validate passes on full dotnet.yml (40 rules, 0 errors)
  • Ran fixed rules through opengrep v1.19.0 against NIST Juliet C# test cases (4,300+ files)
  • Results match between opengrep and semgrep (identical TP/FP/FN counts)
  • Verified zero false positives across all 4 fixed rules
  • pytest passes (139 tests, 0 failures)
  • End-to-end scan through Socket Basics pipeline on a .NET repo

@dc-larsen David Larsen (dc-larsen) requested a review from a team as a code owner April 11, 2026 04:22
@dc-larsen David Larsen (dc-larsen) changed the title fix(rules): improve precision of 4 high-FP dotnet Semgrep rules fix(rules): improve precision of 4 high-FP dotnet opengrep rules Apr 11, 2026
Addresses customer SAST evaluation feedback where 4 rules produced 150/170
false positives (88% of all FPs), inflating the reported FP rate to 91%.

Rules fixed:
- dotnet-xss-response-write: Convert to taint mode. Previously matched any
  .Write() call including Serilog ITextFormatter log sinks. Now requires
  data flow from user input sources to Response.Write sinks.
- dotnet-hardcoded-credentials: Add value inspection and credential API
  patterns. Previously matched on variable names alone, flagging config
  key paths like "UseCaptchaOnResetPassword".
- dotnet-crypto-failures: Target actual weak algorithms (3DES, DES, RC2,
  RijndaelManaged) instead of Encoding.UTF8.GetBytes() which flagged the
  recommended SHA256.HashData(Encoding.UTF8.GetBytes(...)) pattern.
- dotnet-path-traversal: Convert to taint mode. Previously matched all
  Path.Combine() calls including those using framework-provided paths
  like _env.WebRootPath.

Validated with opengrep v1.19.0 against NIST Juliet C# test suite:
  xss-response-write:    Prec 41.6% -> 100%, Recall 47.8% -> 24.3%
  hardcoded-credentials: Prec 0.0%  -> 100%, Recall 0.0%  -> 3.6%
  crypto-failures:       Prec 36.7% -> 100%, Recall 51.4% -> 50.0%
  path-traversal:        Prec 0.0%  -> 100%, Recall 0.0%  -> 45.2%
@dc-larsen David Larsen (dc-larsen) force-pushed the fix/dotnet-sast-rule-precision branch from 4958b6f to cdb7224 Compare April 11, 2026 11:36
@dc-larsen
Copy link
Copy Markdown
Contributor Author

David Larsen (dc-larsen) commented Apr 11, 2026

E2E Validation: dotnet rule precision (round 1)

Validated the updated rules against two intentionally vulnerable .NET repositories using opengrep v1.19.0 and the full socket-basics pipeline.

Test Targets

Repo Framework Description
the-most-vulnerable-dotnet-app .NET 10.0 Educational repo showcasing common .NET vulnerabilities
AspGoat ASP.NET Core 8.0 OWASP vulnerable ASP.NET Core app

Results

the-most-vulnerable-dotnet-app

Rule Before After Notes
dotnet-hardcoded-credentials 8 0 Correctly filters out empty-string DTO property defaults (Password = "")
dotnet-path-traversal 15 0 Eliminates static Path.Combine matches (framework paths, hardcoded strings)
dotnet-xss-response-write 0 0 Repo uses Response.WriteAsync() (ASP.NET Core), not classic Response.Write()
dotnet-crypto-failures 0 0 No weak crypto algorithms (3DES/DES/RC2/Rijndael) present

AspGoat

Rule Before After Notes
dotnet-hardcoded-credentials 1 1 True positive preserved: defaultPassword = "admin123"
dotnet-path-traversal 1 0 Taint sources cover classic ASP.NET; ASP.NET Core patterns addressed in round 2

Pipeline Integration

Check Status
opengrep connector loads Pass
.socket.facts.json output valid Pass
Socket API submission Pass
Alert metadata (CWE, OWASP, fix, references) Present and correct

Observation

The taint-mode rules correctly eliminate pattern-match false positives. Initial taint sources target classic ASP.NET WebForms (Request.QueryString, Request.Form). ASP.NET Core controller parameter binding ([FromQuery], [FromBody], IFormFile) and Response.WriteAsync coverage added in round 2.

@dc-larsen
Copy link
Copy Markdown
Contributor Author

David Larsen (dc-larsen) commented Apr 11, 2026

E2E Validation: ASP.NET Core coverage added (round 2)

Extended taint sources and sinks to cover ASP.NET Core patterns, then re-validated against both test repos.

Changes

Sources (both path-traversal and xss-response-write):

  • Controller parameter binding: [FromQuery], [FromBody], [FromRoute], [FromForm]
  • File upload: IFormFile.FileName, IFormFile.ContentType

Sinks (xss-response-write):

  • Response.WriteAsync(...), HttpContext.Response.WriteAsync(...), Html.Raw(...)

Sinks (path-traversal):

  • Fully-qualified System.IO.File.* variants (ReadAllText, WriteAllText, ReadAllBytes, WriteAllBytes, Exists, Open, Delete) for ASP.NET Core code that uses explicit namespace qualification

Results

the-most-vulnerable-dotnet-app

Rule Before After (round 2) Findings
dotnet-path-traversal 15 false positives 4 true positives [FromQuery] filenameFile.Exists/File.ReadAllText; [FromBody] requestFile.WriteAllText. All with dataflow traces.
dotnet-xss-response-write 0 1 true positive [FromQuery] urlUrlDecode → string concat → Response.WriteAsync. Dataflow trace present.
dotnet-hardcoded-credentials 8 false positives 0 Empty-string defaults correctly filtered.

AspGoat

Rule Before After (round 2) Findings
dotnet-path-traversal 1 false positive 1 true positive IFormFile.FileNamePath.Combinenew FileStream. Dataflow trace present.
dotnet-hardcoded-credentials 1 true positive 1 true positive defaultPassword = "admin123" preserved.

Validation

Check Result
opengrep --validate 0 errors, 40 rules
pytest 139 passed
Socket API (round 2 scans) All 7 findings visible in dashboard
False positives (target rules) 0 across both repos
True positives (target rules) 7 across both repos

…net rules

Add controller parameter binding sources ([FromQuery], [FromBody],
[FromRoute], [FromForm]) and IFormFile.FileName to path-traversal
and XSS taint rules. Add Response.WriteAsync and Html.Raw as XSS
sinks. Add fully-qualified System.IO.File.* sink variants for
ASP.NET Core code that uses explicit namespace qualification.

E2E tested against two vulnerable .NET repos: 7 true positives
found, zero false positives.
Copy link
Copy Markdown
Contributor

@lelia lelia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

David Larsen (@dc-larsen) Thanks for putting this together! Reducing 150 false positives from 4 rules is meaningful, and the move toward taint-style matching is directionally the right fix for the worst offenders here. I also validated the updated dotnet.yml on my end with opengrep --validate, and the config is syntactically valid.

I do see a few semantic issues worth tightening before merge - summarized here, but see my review comments for more details:

  1. dotnet-path-traversal: Path.GetFullPath(...) is currently treated as a sanitizer at socket_basics/rules/dotnet.yml:410
  2. dotnet-hardcoded-credentials: the variable-name regex got much narrower at socket_basics/rules/dotnet.yml:161
  3. dotnet-crypto-failures: one broad pattern still looks FP-prone at socket_basics/rules/dotnet.yml:805

Overall, I think the main idea in this PR is solid: move away from simple pattern matching where it was clearly overfiring, and add more context-aware logic. I’d just recommend tightening the three cases above so we don’t trade one class of false positives for a quieter set of false negatives or misclassifications.

Comment thread socket_basics/rules/dotnet.yml Outdated
- pattern: Directory.EnumerateFiles(...)
pattern-sanitizers:
- pattern-either:
- pattern: Path.GetFullPath(...)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetFullPath only normalizes the path; it does not make attacker-controlled input safe on its own. For example, ../../etc/passwd becomes a full absolute path, but it can still point outside the intended directory. In practice, code is only safe after the normalized path is checked against an allowed base directory. As written, the rule may now miss real path traversal cases.

Comment thread socket_basics/rules/dotnet.yml Outdated
string $VAR = "$VALUE";
- metavariable-regex:
metavariable: $VAR
regex: (?i).*(password|passwd|pwd|secret|token|api_key|connection_string).*
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new matcher looks for names like api_key and connection_string, but idiomatic C# code often uses apiKey, connectionString, privateKey, accessKey, etc. That means this version is likely to miss common .NET secret names even though it improves precision.

Comment thread socket_basics/rules/dotnet.yml Outdated
- pattern: new RijndaelManaged() { Key = Encoding.UTF8.GetBytes($KEY) }
- pattern: new AesCryptoServiceProvider() { Key = Encoding.UTF8.GetBytes($KEY) }
# Encoding password for storage without hashing (storing plaintext)
- pattern: Convert.ToBase64String(Encoding.UTF8.GetBytes($SECRET))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Convert.ToBase64String(Encoding.UTF8.GetBytes($SECRET)) is not necessarily "weak crypto" by itself. It can also appear in benign encoding/transport code. So while the rule is much better than before, this particular pattern could still generate noise and doesn’t map as cleanly to actual weak algorithm usage as the rest of the rewrite.

- Remove Path.GetFullPath() as path-traversal sanitizer (normalizes but
  does not prevent traversal on its own)
- Broaden hardcoded-credentials variable regex to cover idiomatic C#
  naming: apiKey, connectionString, privateKey, accessKey, authToken
- Remove overly broad Base64 encoding pattern from crypto-failures
  (benign encoding/transport use generates noise)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants