Thursday, 13 October, 2022 UTC


Summary

Snyk Code supports various languages that are important in the cloud native arena, Ruby among them (and we’ve seen great adoption, so thank you!). Our researchers constantly monitor our rule sets, using our training set of open source projects but also (and yes, this is an advantage of a SaaS service) how the rules perform on the code that is scanned. As a reminder, Snyk does not use your code to train our sets, but we do aggregate usage statistics.
Within the Ruby data, we saw an outlier (which we also got feedback from users about):
[Figure: HardcodedEmail rule findings relative to the other Ruby rules.]

Why we removed the HardcodedEmail rule for Ruby
We had a rule that looked for email addresses within Ruby code and flagged them as a possible low priority issue. While it is still a best practice not to store hardcoded emails in code, we saw that the issues this rule produced were often ignored or left unaddressed. We looked deeper into the numbers, and the noise the rule generated was out of all proportion to how often its recommendation was acted on, so we decided to remove the rule. We have other rules that use data flow to find misused credentials or passwords, so we are still very much covered from that perspective.
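To make the noise problem concrete, here is a minimal sketch of what a purely pattern-based email check might look like. It is not Snyk Code's actual rule: the regular expression, the `find_hardcoded_emails` helper, and the sample snippet are all assumptions made up for illustration.

```ruby
# Hypothetical sketch of a naive hardcoded-email check; NOT Snyk Code's real rule.
# The pattern is deliberately simple and is an assumption for illustration only.
EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/

# Returns [line_number, email] pairs for every email-like string in the source text.
def find_hardcoded_emails(source)
  findings = []
  source.each_line.with_index(1) do |line, number|
    line.scan(EMAIL_PATTERN) { |email| findings << [number, email] }
  end
  findings
end

# Made-up Ruby snippet to scan: a comment, a constant, and a dynamic lookup.
sample = <<~'RUBY'
  # Maintainer: dev@example.com
  SUPPORT_ADDRESS = "support@example.com"
  user = User.find_by(email: params[:email])
RUBY

find_hardcoded_emails(sample).each do |number, email|
  puts "line #{number}: #{email}"
end
```

A check like this reports the maintainer comment and the support address alike, even though neither is likely to be a security problem, which is the kind of finding that tends to be ignored.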
What does this mean for you? Well, if you are using Ruby, expect a reduction in low priority findings based on email addresses soon.
What is the takeaway beyond the Ruby rule?
Every static application security testing (SAST) tool, and static code analysis tool more generally, needs to keep a balance between soundness (identifying only true positive issues, but in doing so missing some) and completeness (identifying any possible issue, but producing lots of noise with false positives). Thanks to Snyk Code’s speed, we can use 120 to 150 thousand projects per language as a test set to help keep this soundness versus completeness balance.
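One way to put numbers on this balance is with precision and recall. Mapping soundness (as described above) to precision and completeness to recall is our reading, not terminology taken from Snyk Code itself. The `rule_quality` helper and the counts below are hypothetical and are not Snyk data.

```ruby
# Hypothetical helper; the counts passed in below are invented for illustration.
# precision = share of reported findings that are real issues
# recall    = share of real issues that were reported
def rule_quality(true_positives:, false_positives:, false_negatives:)
  reported = true_positives + false_positives
  actual = true_positives + false_negatives
  {
    precision: reported.zero? ? 0.0 : true_positives.to_f / reported,
    recall: actual.zero? ? 0.0 : true_positives.to_f / actual
  }
end

# A strict rule reports little but is usually right; a broad rule catches more but is noisy.
strict = rule_quality(true_positives: 40, false_positives: 2, false_negatives: 60)
broad = rule_quality(true_positives: 95, false_positives: 300, false_negatives: 5)

puts format("strict rule: precision %.2f, recall %.2f", strict[:precision], strict[:recall])
puts format("broad rule:  precision %.2f, recall %.2f", broad[:precision], broad[:recall])
```

Running a rule against a large test set of projects is what makes counts like these measurable in the first place.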
On top of that, we have usage data from the Snyk app. We used it to tune the system originally, and we continue to optimize with it. There are several reasons for this ongoing work:
  • The environment constantly changes: JavaScript is very famous for having a new fashionable library every other week. But jokes aside, every language environment constantly adds new or changes existing libraries. We are using machine learning to automatically learn changes and add them to our knowledge base.
  • Some best practices are not really used: While it is a best practice not to include hardcoded emails in code, we have to accept that reality might differ and that there are good reasons to deviate.
  • New issue types emerge: Remember Trojan Source? This was a totally new type of issue never seen before. Within a few days, the Snyk Code team was able to provide rules for it, with high accuracy in findings, in every supported programming language.
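For readers who have not come across Trojan Source: the technique relies on Unicode bidirectional control characters that make source code display differently from how it is compiled or interpreted. The following is a minimal sketch of a check for those characters. It is not the rule Snyk Code ships, and the `find_bidi_controls` helper is a hypothetical name used only here.

```ruby
# Hypothetical sketch; NOT the rule shipped in Snyk Code.
# Unicode bidirectional control characters: U+202A..U+202E and U+2066..U+2069.
BIDI_CONTROLS = /[\u202A-\u202E\u2066-\u2069]/

# Returns [line_number, column, codepoint] for each bidi control character found.
# Assumes the source text is a UTF-8 encoded string.
def find_bidi_controls(source)
  findings = []
  source.each_line.with_index(1) do |line, number|
    line.each_char.with_index(1) do |char, column|
      next unless char.match?(BIDI_CONTROLS)
      findings << [number, column, format("U+%04X", char.ord)]
    end
  end
  findings
end

# Made-up input: the second line hides a right-to-left override inside a string literal.
sample = "role = \"user\"\nrole = \"admin\u202E\"\n"

find_bidi_controls(sample).each do |number, column, codepoint|
  puts "line #{number}, column #{column}: #{codepoint}"
end
```

Unlike the email check, a finding here is rarely legitimate in ordinary source code, so a simple rule of this kind can stay accurate.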
Thank you for your feedback and suggestions on how to improve the product. Alongside data-driven decisions, your feedback is a valuable source of information for us. We listen and react. If you haven’t used Snyk Code yet, now is a good time to start. Just sign up for a free Snyk account and get results in seconds.