Semgrep Versions

Lightweight static analysis for many languages. Find bug variants with patterns that look like source code.

v1.65.0

1.65.0 - 2024-03-11

Changed

  • Removed the extract-mode rules experimental feature. (extract_mode)

v1.64.0

1.64.0 - 2024-03-07

Changed

  • Removed the AST caching experimental feature (--experimental --ast-caching in osemgrep and -parsing_cache_dir in semgrep-core). (ast_caching)
  • Removed the Registry caching experimental feature (--experimental --registry-caching) in osemgrep. (registry_caching)

Fixed

  • Clean any credentials from project URL before using it, to prevent leakage. (saf-876)
  • ci: Updated the logic for the informational message printed when no rules are sent, so that it displays correctly when Secrets is enabled (in addition to when Code is). (scrt-455)
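
The credential cleanup in saf-876 can be pictured with a small sketch. This is not Semgrep's actual implementation: strip_credentials is a hypothetical helper, written here with Python's standard urllib.parse, that drops any user:password part from a URL before the URL is used elsewhere.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_credentials(url: str) -> str:
    """Return url without any userinfo (user or user:password) component.

    Hypothetical helper for illustration only; not part of Semgrep.
    """
    parts = urlsplit(url)
    if parts.username is None and parts.password is None:
        return url  # nothing to remove (also covers scp-like git remotes)

    host = parts.hostname or ""
    if ":" in host:
        host = f"[{host}]"  # put the brackets back around IPv6 literals
    if parts.port is not None:
        host = f"{host}:{parts.port}"

    # Rebuild the URL with a netloc that contains only host[:port].
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


cleaned = strip_credentials("https://user:token@github.com/org/repo.git")
```

Here cleaned is "https://github.com/org/repo.git"; a URL without credentials is returned unchanged.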

v1.63.0

1.63.0 - 2024-02-27

Added

  • Dataflow: Added support for nested record patterns such as { body: { param } } in the LHS of an assignment. Now given { body: { param } } = tainted Semgrep will correctly mark param as tainted. (flow-68)
  • Matching: metavariable-regex can now match on metavariables of interpolated strings which use variables that have known values. (saf-865)
  • Added support for parsing Swift Package Manager manifest and lockfiles. (sc-1217)

Fixed

  • Fixed a bug where taint signatures did not capture changes to parameters' fields. (flow-70)
  • Scan summary links printed after semgrep ci scans now reflect a custom SEMGREP_APP_URL, if one is set. (saf-353)

v1.62.0

1.62.0 - 2024-02-22

Added

  • Pro: Adds support for Python constructors to taint analysis.

    If interfile naming resolves that a Python constructor is called, taint will now track these objects with fewer heuristics. Without interfile analysis these changes have no effect on the behavior of tainting. The overall result is that in the following program the OSS analysis would match both calls to sink, while the interfile analysis would only match the second call to sink.

    class A:
        untainted = "not"
        tainted = "not"
        def __init__(self, x):
            self.tainted = x

    a = A("tainted")
    # OK:
    sink(a.untainted)
    # MATCH:
    sink(a.tainted)

    (ea-272)

  • Pro: taint-mode: Added basic support for "index sensitivity", that is, Semgrep will track taint on individual indexes of a data structure when these are constant values (integers or strings), and the code uses the built-in syntax for array indexing in the corresponding language (typically E[i]). For example, in the Python code below Semgrep Pro will not report a finding on sink(x) or sink(x[1]) because it will know that only x[42] is tainted:

    x[1] = safe
    x[42] = source()
    sink(x)      # no more finding
    sink(x[1])   # no more finding
    sink(x[42])  # finding
    sink(x[i])   # finding
    

    There is still a finding for sink(x[i]) when i is not constant. (flow-7)
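
The index-sensitivity behavior described in flow-7 can be mimicked with a toy model. This is only a sketch of the idea, not how Semgrep Pro is implemented: the IndexedTaint class and the UNKNOWN marker are hypothetical names, and only constant-index writes are modeled.

```python
UNKNOWN = object()  # stands for a non-constant index, as in x[i]


class IndexedTaint:
    """Toy per-index taint record for one container variable."""

    def __init__(self):
        self.tainted_indexes = set()

    def assign(self, index, is_tainted):
        # x[index] = value: a constant index updates only that slot.
        if is_tainted:
            self.tainted_indexes.add(index)
        else:
            self.tainted_indexes.discard(index)

    def read_whole(self):
        # sink(x): in this model the container itself is never tainted.
        return False

    def read(self, index):
        if index is UNKNOWN:
            # sink(x[i]): i could be any index, so stay conservative.
            return bool(self.tainted_indexes)
        return index in self.tainted_indexes


x = IndexedTaint()
x.assign(1, is_tainted=False)   # x[1] = safe
x.assign(42, is_tainted=True)   # x[42] = source()

findings = {
    "sink(x)": x.read_whole(),
    "sink(x[1])": x.read(1),
    "sink(x[42])": x.read(42),
    "sink(x[i])": x.read(UNKNOWN),
}
```

Only the sink(x[42]) and sink(x[i]) entries come out as True, mirroring the four cases in the example above.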

Changed

  • taint-mode: Added exact: false sinks so that one can specify that anything inside a code region is a sink, e.g. if (...) { ... }. This used to be the semantics of sink specifications until Semgrep 1.1.0, when we made sink matching more precise by default. Now we allow reverting to the old semantics.

    In addition, when exact: true (the default), we simplified the heuristic used to support traditional sink(...)-like specs together with the option taint_assume_safe_functions: true. Now, if the spec formula is not a patterns with a focus-metavariable, we look for taint in the arguments of a function call. (flow-1)

  • The project name for repos scanned locally will now be local_scan/<repo_name> instead of simply <repo_name>. This will clarify the origin of those findings. Also, the "View Results" URL displayed for findings now includes the repository and branch names. (saf-856)

Fixed

  • taint-mode: experimental: Semgrep CLI taint traces do not yet support multiple labels, so Semgrep picks one arbitrary label to report, which is sometimes not the desired one. As a temporary workaround, Semgrep will look at the requires of the sink, and if it has the shape A and ..., it will pick A as the preferred label and report its trace. (flow-65)
  • Fixed trailing newline parsing in pyproject.toml and poetry.lock files. (gh-9777)
  • Fixed an issue that led to incorrect autofix application in certain cases where multiple fixes were applied to the same line. (saf-863)
  • The tokens for type parameter brackets are now stored in the generic AST, which allows those constructs to be autofixed correctly. (tparams)

v1.61.1

1.61.1 - 2024-02-14

Added

  • Added performance metrics using OpenTelemetry for better visualization. Users wishing to understand the performance of their Semgrep scans or to help optimize Semgrep can configure the backend collector created in libs/tracing/unix/Tracing.ml.

    This is experimental and both the implementation and flags are likely to change. (ea-320)

  • Created a new environment variable SEMGREP_REPO_DISPLAY_NAME for use in semgrep CI. Currently, this does nothing. The goal is to provide a way to override the display name of a repo in the Semgrep App. (gh-8953)

  • The OCaml/C executable (semgrep-core or osemgrep) is now passed through the strip utility, which reduces its size by 10-25% depending on the platform. Contribution by Filipe Pina (@fopina). (gh-9471)

Changed

  • "Missing plugin" errors (i.e., rules that cannot be run without --pro) will now be grouped and reported as a single warning. (ea-842)

v1.60.1

1.60.1 - 2024-02-09

Added

  • Rule syntax: Metavariables by the name of $_ are now anonymous, meaning that they do not unify within a single pattern or across patterns, and essentially just unconditionally specify some expression.

    For instance, the pattern foo($_, $_) may match the code foo(1, 2).

    This will change the behavior of existing rules that use the metavariable $_, if they rely on unification still happening. This can be fixed by simply giving the metavariable a real name like $A. (ea-837)

  • Added infrastructure for semgrep supply chain in semgrep-core. Not fully functional yet. (ssc-port)
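
To make the difference between $_ and a named metavariable (ea-837) concrete, here is a toy argument matcher. It is a sketch only: match_args is a hypothetical function that compares argument lists given as strings, and it is far simpler than Semgrep's real matching engine.

```python
def match_args(pattern_args, code_args):
    """Return metavariable bindings if the arguments match, else None."""
    if len(pattern_args) != len(code_args):
        return None
    bindings = {}
    for pat, actual in zip(pattern_args, code_args):
        if pat == "$_":
            continue  # anonymous: matches anything and binds nothing
        if pat.startswith("$"):
            if pat in bindings and bindings[pat] != actual:
                return None  # a named metavariable must unify
            bindings[pat] = actual
        elif pat != actual:
            return None  # a literal argument must match exactly
    return bindings


anonymous = match_args(["$_", "$_"], ["1", "2"])   # foo($_, $_) vs foo(1, 2)
named = match_args(["$A", "$A"], ["1", "2"])       # foo($A, $A) vs foo(1, 2)
```

In this model anonymous is an empty set of bindings (a match), while named is None because $A cannot be both 1 and 2.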

Changed

  • Dataflow: Simplified the IL translation for Python with statements to let symbolic propagation assume that with foo() as x: ... entails x = foo(), so that e.g. Session().execute("...") matches:

    with Session() as s:
        s.execute("SELECT * from T")

    (CODE-6633)

Fixed

  • Output: Semgrep CLI no longer interpolates metavariables twice when the text substituted for a metavariable itself contains a valid metavariable. (ea-838)
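
The double-interpolation problem from ea-838 is easy to reproduce in miniature. The two functions below are hypothetical illustrations, not Semgrep's code: one substitutes metavariables one after another, the other makes a single pass over the message.

```python
import re

METAVAR = re.compile(r"\$[A-Z_][A-Z_0-9]*")


def interpolate_naive(message, bindings):
    # Sequential replacement: text inserted for one metavariable can be
    # rewritten again when a later metavariable is processed.
    for name, value in bindings.items():
        message = message.replace(name, value)
    return message


def interpolate_once(message, bindings):
    # Single pass: each metavariable in the original message is replaced
    # exactly once, and the inserted text is never rescanned.
    return METAVAR.sub(lambda m: bindings.get(m.group(0), m.group(0)), message)


bindings = {"$X": "$Y", "$Y": "secret"}  # the code matched by $X is literally "$Y"
naive = interpolate_naive("found $X", bindings)
single = interpolate_once("found $X", bindings)
```

Here naive is "found secret", because the inserted $Y was interpolated a second time, while single is "found $Y".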

v1.60.0

1.60.0 - 2024-02-08

Added

  • Rule syntax: Metavariables by the name of $_ are now anonymous, meaning that they do not unify within a single pattern or across patterns, and essentially just unconditionally specify some expression.

    For instance, the pattern foo($_, $_) may match the code foo(1, 2).

    This will change the behavior of existing rules that use the metavariable $_, if they rely on unification still happening. This can be fixed by simply giving the metavariable a real name like $A. (ea-837)

  • Added infrastructure for semgrep supply chain in semgrep-core. Not fully functional yet. (ssc-port)

Fixed

  • Output: Semgrep CLI no longer interpolates metavariables twice when the text substituted for a metavariable itself contains a valid metavariable. (ea-838)

v1.59.1

1.59.1 - 2024-02-02

Added

  • taint-mode: Pro: Semgrep can now track taint via static class fields and global variables, such as in the following example:

    static char* x;
    
    void foo() {
        x = "tainted";
    }
    
    void bar() {
        sink(x);
    }
    
    void main() {
        foo();
        bar();
    }

    (pa-3378)

Fixed

  • Pro: Make inter-file analysis more tolerant to small bugs, resorting to graceful degradation and continuing with the scan, rather than crashing. (pa-3387)

v1.59.0

1.59.0 - 2024-01-30

Added

  • Swift: Now supports typed metavariables, such as ($X : ty). (pa-3370)

Changed

  • Add Elixir to Pro languages list in help information. (gh-9609)

  • Removed sg alias to avoid naming conflicts with the shadow-utils sg command for Linux systems. (gh-9642)

  • Prevent unnecessary computation when running scans without verbose logging enabled (gh-9661)

  • Deprecated the option taint_match_on, introduced in 1.51.0; it is being renamed to taint_focus_on. Note that taint_match_on was experimental, and taint_focus_on is experimental too. taint_match_on will continue to work, but it will be removed completely at some point after 1.63.0. (pa-3272)

  • Added information on product-related flags to help output, especially for Semgrep Secrets. (pa-3383)

  • taint-mode: Improve inference of best matches for exact-sources, exact-sanitizers, and sinks. Now we also avoid FPs in cases such as:

    dangerouslySetInnerHTML = {
      // ok:
      {__html: props ? DOMPurify.sanitize(props.text) : ''} // no more FPs!
    }
    

    where props is tainted and the sink specification is:

    patterns:
      - pattern: |
         dangerouslySetInnerHTML={{__html: $X}}
      - focus-metavariable: $X
    

    Previously Semgrep wrongly considered the individual subexpressions of the conditional as sinks, including the props in props ? ..., thus producing a false positive. Now it will only consider the conditional expression as a whole as the sink. (rules-6457)

  • Removed an internal legacy syntax for secrets rules (mode: semgrep_internal_postprocessor). (scrt-320)

Fixed

  • Autofix: Fixes that span multiple lines will now try to align inserted fixed lines with each other. (gh-3070)

  • Matching: Try blocks with catch clauses can now match try blocks that have extraneous catch clauses, as long as the pattern's catch clauses are a subset of those in the code. For instance, the pattern

    try:
      ...
    catch A:
      ...
    

    can now match

    try:
      ...
    catch A:
      ...
    catch B:
      ...

    (gh-3362)

  • Previously, some people got the error:

    Encountered error when running rules: Other syntax error at line NO FILE INFO YET:-1:
    Invalid_argument: String.sub / Bytes.sub
    

    Semgrep should now report this error properly with a file name and line number and handle it gracefully. (gh-9628)

  • Fixed Dockerfile parsing bug where multiline comments were parsed incorrectly. (gh-9628-2)

  • The language server will now properly respect findings that have been ignored via the app. (lsp-fingerprints)

  • taint-mode: Pro: Semgrep will now propagate taint via instance variables when calling methods within the same class, making this example work:

    class Test {
    
      private String str;
    
      public setStr() {
        this.str = "tainted";
      }
    
      public useStr() {
        //ruleid: test
        sink(this.str);
      }
    
      public test() {
        setStr();
        useStr();
      }
    
    }

    (pa-3372)

  • taint-mode: Pro: Taint traces will now reflect when taint is propagated via class fields, such as in this example:

    class Test {
    
      private String str;
    
      public setStr() {
        this.str = "tainted";
      }
    
      public useStr() {
        //ruleid: test
        sink(this.str);
      }
    
      public test() {
        setStr();
        useStr();
      }
    
    }
    

    Previously Semgrep would report that taint originated at this.str = "tainted", but it would not tell you how the control flow got there. Now the taint trace will indicate that we get there by calling setStr() inside test(). (pa-3373)

  • Addressed an issue related to matching top-level identifiers with meta-variable qualified patterns in C++, such as matching ::foo with ::$A::$B. This problem was specific to Pro Engine-enabled scans. (pa-3375)

v1.58.0

1.58.0 - 2024-01-23

Added

  • Added a severity icon (e.g. "❯❯❱") and corresponding color to our CLI text output for findings of known severity. (grow-97)

  • Naming has better support for if statements. In particular, for languages with block scope, shadowed variables inside if-else blocks that are tainted won't "leak" outside of those blocks.

    This helps with features related to naming, such as tainting.

    For example, previously in Go, Semgrep would report the x in sink(x) as tainted, even though the x that is tainted is the one scoped to the if block.

    func f() {
      x := "safe";
      if (c) {
        x := "tainted";
      }
      // x should not be tainted
      sink(x);
    }
    

    This is now fixed. (pa-3185)

  • OSemgrep can now scan remote git repositories. Pass --experimental --pro --remote http[s]://<website>/.../<repo>.git to use this feature. (pa-remote)

Changed

  • Rules stored under a "hidden" directory (e.g., dir/.hidden/myrule.yml) are now processed when using --config . Previously, the handling of dot files under dir was inconsistent: we kept rules/.semgrep.yml but skipped path/.github/foo.yml, and kept src/.semgrep/bad_pattern.yml but skipped ./.pre-commit-config.yaml. This was mainly because we used to fetch rules from ~/.semgrep/ implicitly when --config was not given. That feature has been removed, so now we can keep it simple. (hidden_rules)
  • Removed support for writing rules using jsonnet. This feature will be restored once we finish the port to OCaml of the semgrep CLI. (jsonnet)
  • The primitive object construct expression will no longer match the new expression pattern. For example, the pattern new $TYPE will now only match new int, not int(). (pa-3336)
  • The placement new expression will no longer match the new expression without placement. For instance, the pattern new ($STORAGE) $TYPE will now only match new (storage) int and not new int. (pa-3338)

Fixed

  • Java: You can now use metavariable ellipses properly in function arguments, as statements, and as expressions.

    For instance, you may write the pattern

    public $F($...ARGS) { ... }

    (gh-9260)

  • Nosemgrep: Fixed a bug where Semgrep would err upon reading a nosemgrep comment with multiple rule IDs. (gh-9463)

  • Fixed bugs in gitignore/semgrepignore globbing implementation affecting --experimental. (gh-9544)

  • Fixed rule IDs, descriptions, findings, and autofix text not wrapping as expected. Per the design spec, findings in the same file but from different rules are now separated by a newline instead of a horizontal separator. (grow-97)

  • Keep track of the origin of return; statements in the dataflow IL so that recently added (Pro-only) at-exit: true sinks work properly on them. (pa-3337)

  • C++: Improved the translation of delete expressions to the dataflow IL so that recently added (Pro-only) at-exit: true sinks work on them. Previously, delete expressions at "exit" positions were not properly recognized as such. (pa-3339)

  • cli: Fixed a Python runtime error with zero-width wrapped printing. (pa-3366)

  • Fixed a bug where Gemfile.lock files with multiple GEM sections would not be parsed correctly. (sc-1230)