The White House Memory Safety Appeal is a Security Red Herring

In the Holy Programming Language Wars, the lingua franca of system programming – also known as C – is often lambasted for being insecure, error-prone, and plagued with more types of behavior that are undefined than ones that are defined by the C standards. Many programming languages were said to be ‘C killers’, yet C is still alive today. That didn’t stop the US White House’s Office of the National Cyber Director (ONCD) from putting out a report in which both C and C++ were criticized for being ‘unsafe’ when it came to memory management.

The full report (PDF) is pretty light on technical details, while citing only blog posts by Microsoft and Google as its ‘expert sources’. The claim that memory safety issues are the primary cause of CVEs is not substantiated, or at least ignores the severity of CVEs when looking at the CISA statistics for active exploits. Beyond this call for ‘memory safety’, the report then goes on to effectively call for more testing and validation, while kicking in doors that were opened back in the 1970s already with the Steelman requirements and the High Order Language Working Group (HOLWG) of 1975.

What truly is the impact and factual basis of the ONCD report?

CVE Quality Not Quantity

Perhaps the most vexing of the claims made repeatedly in the ONCD report – as well as in the longer, but very similar report by the NSA, CISA and others titled The Case for Memory Safe Roadmaps – is that memory safety issues are the primary problem. These claims always seem to trace back to reports by Microsoft and Google, rather than to the lists of actively exploited CVEs. Take the 2023 report on 2022’s top 12 hit list, which features everyone’s favorite vulnerabilities, such as Log4j (CVE-2021-44228) with its sloppy input validation, or three CVEs in Microsoft’s Exchange Server, hitting a triple whammy of Common Weakness Enumerations (CWEs).

Just like 2022’s chart leader (Fortinet SSL VPN), this includes CWE-22: the improper limitation of a pathname to a restricted directory. Exchange Server was also featured for CWE-918 (server-side request forgery, SSRF) and CWE-287 (improper authentication). Of these, memory safety issues can be a factor with CWE-287 (e.g. CVE-2021-35395), albeit very sporadically. The pattern, especially with remote exploits (the kind most relevant to ‘cybersecurity’), is overwhelmingly one of flawed input validation and handling, mostly in the form of omitted checks and logic errors.
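
To make CWE-22 concrete, here is a minimal sketch of the kind of check that is usually missing. It assumes C++17 and an absolute base directory; the helper name resolve_within and the paths are made up for illustration, and the check is purely lexical, so it does not resolve symlinks the way a production server would have to.

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>
#include <optional>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: joins `user_input` onto `base` and returns the result
// only if it still lies inside `base`. Purely lexical, so symlinks are NOT
// resolved; that is an assumption of this sketch.
std::optional<fs::path> resolve_within(const fs::path& base,
                                       const std::string& user_input) {
    const fs::path requested(user_input);
    if (requested.is_absolute()) {
        return std::nullopt;  // an absolute path would replace the base entirely
    }

    fs::path root = base.lexically_normal();
    if (!root.has_filename()) {
        root = root.parent_path();  // drop a trailing separator ("/srv/files/")
    }

    // Collapse "." and ".." before comparing, so "a/../../x" cannot slip through.
    const fs::path full = (root / requested).lexically_normal();

    // Every component of the base must be a prefix of the resulting path.
    const auto root_end =
        std::mismatch(root.begin(), root.end(), full.begin(), full.end()).first;
    if (root_end != root.end()) {
        return std::nullopt;
    }
    return full;
}

// Usage check with made-up paths.
void path_demo() {
    assert(resolve_within("/srv/files", "docs/readme.txt").has_value());
    assert(!resolve_within("/srv/files", "../../etc/passwd").has_value());
}
```

Nothing in this check involves memory management: the same omission is just as exploitable in a garbage-collected language.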

Putting the focus on memory safety is more than a little suspect when the worst CVEs come from programmers not putting in basic checks for path traversal and forgetting to fully check user credentials. What is also worrying is the complete lack of any reference to the favorite language of the military, medical, and aviation fields where things going boom (prematurely) is generally considered a bad thing: Ada.

Steelman

As mentioned earlier, the Steelman requirements are the result of the foremost computer science experts at the time being asked to come up with the requirements that a high-level language would have to fulfill in order to be used for the most demanding tasks across the US DoD. In a 1997 comparison by David A. Wheeler of Ada 95, C89, C++, and Java, these languages’ adherence to the Steelman requirements is determined, with Ada obviously scoring the highest (93%), while Java comes in at 72%, C++ 68% and C at the back with 53%. Of note here is of course that since then all of these languages have received many updates to their respective standards, but it still provides a useful snapshot.

The lack of built-in concurrency support in C and C++ has been partially resolved at this point, with C++11’s standard memory model, but is hard to fully resolve without modifying the language’s foundations and rendering it incompatible with existing codebases. This is something that you can only do with something less fundamental like a scripting language, and even then it’s likely to upset a large part of the userbase for many years.

Where Ada scores very highly is not only with its concurrency handling, but also with its type system, which includes aspects such as parameters and return values. What often upsets novice Ada programmers who migrate from other languages is that they first have to set up their types with constraints, and this can seem time-consuming and unneeded. But as described in Steelman, these restrictions, along with code that’s stripped of as much ambiguity as possible, help to avoid programming mistakes.
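
For readers who have not seen constrained types, the idea can be roughly approximated in C++. In Ada, a declaration such as "type Percent is range 0 .. 100;" creates a distinct type, and an out-of-range value raises Constraint_Error. The sketch below is only an analogue under my own assumptions: the Ranged template and the Percent and Celsius aliases are hypothetical names, and unlike Ada the check lives in library code rather than in the language.

```cpp
#include <cassert>
#include <stdexcept>
#include <type_traits>

// Hypothetical C++ stand-in for an Ada range-constrained integer type.
template <int Low, int High>
class Ranged {
    static_assert(Low <= High, "range must not be empty");

public:
    // explicit: a bare int never silently becomes a Ranged value.
    explicit Ranged(int value) : value_(checked(value)) {}

    int get() const { return value_; }

private:
    // Rejects values outside [Low, High] at construction time.
    static int checked(int v) {
        if (v < Low || v > High) {
            throw std::out_of_range("value outside declared range");
        }
        return v;
    }

    int value_;
};

using Percent = Ranged<0, 100>;
using Celsius = Ranged<-90, 60>;

// Distinct ranges are distinct types: a Celsius cannot be passed as a Percent.
static_assert(!std::is_convertible<Celsius, Percent>::value,
              "Celsius must not convert to Percent");

// Usage check: valid values are accepted, invalid ones are rejected.
bool rejects_out_of_range() {
    try {
        Percent too_much(101);
        (void)too_much;
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

void ranged_demo() {
    assert(Percent(42).get() == 42);
    assert(rejects_out_of_range());
}
```

The up-front cost is the same as in Ada: the types have to be declared before any logic is written, which is exactly the part that feels tedious at first.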

Effectively, a good programming language knows what your intent is by setting constraints and offering the means to restrict functions and procedures using contract-based programming, so that the compiler has as much context as possible. Meanwhile, the code should be written in plain English where possible, without cryptic or easily mistyped symbols that can e.g. turn a comparison into an assignment. This is also the reason why Ada is case-insensitive: why would coolVar differ from coolvar when it’s clear from the context what is meant?
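
The comparison-turned-assignment typo is easy to reproduce in C and C++. In the sketch below, kAdminId and both function names are invented for illustration. Most compilers can warn about the first function, but only if warnings are enabled and read. Ada sidesteps the issue because assignment is written := and is a statement, so it cannot appear inside a condition at all.

```cpp
#include <cassert>

constexpr int kAdminId = 1;  // hypothetical ID, for illustration only

// Buggy: `=` assigns kAdminId to user_id, and the non-zero result is
// treated as true, so every caller is considered an admin.
bool is_admin_buggy(int user_id) {
    if (user_id = kAdminId) {
        return true;
    }
    return false;
}

// Intended: `==` compares and leaves user_id untouched.
bool is_admin(int user_id) {
    return user_id == kAdminId;
}

// Usage check: user 42 is not the admin, yet the buggy version says so.
void typo_demo() {
    assert(is_admin_buggy(42));
    assert(!is_admin(42));
}
```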

Memory Safety Is Easy In C++

It’s rather amusing to read old DoD reports, such as a 1991 report (PDF) by the US Air Force called Ada and C++: A Business Case Analysis. This was written before C++ was standardized, but as part of a ‘make stuff cheaper to build’ push by the DoD, C++’s claim to have tacked many of Ada’s features onto C got investigated by multiple government branches, including the FAA. The conclusion then was that Ada was still by far the best choice, with clear signs that its strong code reusability and self-documenting code helped reduce maintenance costs.

Even so, Ada’s lack of popularity outside of the aforementioned fields has led to a dearth of Ada programmers, which has resulted in C++ ultimately being approved for DoD projects like the F-35 program, albeit with strong restrictions on acceptable code to bring it more into line with Ada. What this shows is perhaps that the problem is not so much C++, but more how you use it.

After all, C++ by itself has no major issues with memory management or a lot of undefined behavior as long as you keep away from its C compatibility syntax. With RAII (resource acquisition is initialization) and encapsulating code into classes with well-tested constructors and destructors, you avoid many of the issues that plague C. Add to this C++’s standard template library (STL) with std::string and containers that replace the nightmarish and error-prone C-style strings and arrays, and suddenly you have to try pretty hard to get code that has any memory-related issues, because simple buffer overruns no longer happen.
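
A short sketch of what this looks like in practice follows. The Handle class, its open_count counter, and the helper functions are hypothetical and exist only to make the cleanup observable. One caveat worth stating: std::vector::at() checks the index and throws std::out_of_range, while operator[] performs no check, so the containers only protect you if you use the checked interface.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical resource with a live-instance counter, so cleanup is observable.
class Handle {
public:
    Handle() { ++open_count; }
    ~Handle() { --open_count; }
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;

    static int open_count;
};
int Handle::open_count = 0;

// at() validates every index; an overrun becomes an exception, not a stray read.
int sum_first(const std::vector<int>& values, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values.at(i);
    }
    return total;
}

// std::string grows as needed, so there is no fixed buffer to overflow.
std::string label(const std::string& name) {
    return "user:" + name;
}

// RAII: the Handle is released during stack unwinding, before the catch runs.
bool overrun_is_caught() {
    const std::vector<int> values{1, 2, 3};
    try {
        Handle handle;         // acquired here
        sum_first(values, 5);  // throws when it reaches index 3
    } catch (const std::out_of_range&) {
        return Handle::open_count == 0;
    }
    return false;
}

void raii_demo() {
    assert(overrun_is_caught());
    assert(label("ada") == "user:ada");
}
```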

Of course, many who do programming for a while will be tempted by low-level optimizations, and end up writing things like a lock-free ring buffer and zero-copy RPC libraries using raw memory pointers. In such cases you will want to first of all have a solid understanding of how the underlying hardware works, and get really familiar with tools like Valgrind. The value of Valgrind in particular is hard to overestimate, as with a bit of effort you can analyze your code for memory safety, multi-threading issues (threads, mutexes, etc.) as well as memory usage.
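
To give an idea of how little code such a structure needs, and how much rides on each line, here is a minimal sketch of a single-producer, single-consumer ring buffer using C++11 atomics. The class name SpscRing is my own, it assumes C++17 for std::optional, and it is only valid with exactly one thread calling push() and one thread calling pop(). Relax either assumption, or weaken one of the acquire/release pairs, and it breaks in ways that may only show up under load.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical single-producer/single-consumer ring buffer.
// One slot is kept empty to tell "full" from "empty", so it holds Capacity - 1 items.
template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert(Capacity >= 2, "need at least one usable slot");

public:
    // Producer thread only. Returns false when the buffer is full.
    bool push(const T& item) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % Capacity;
        if (next == tail_.load(std::memory_order_acquire)) {
            return false;
        }
        slots_[head] = item;
        // release: the slot write above becomes visible before the new head.
        head_.store(next, std::memory_order_release);
        return true;
    }

    // Consumer thread only. Returns an empty optional when nothing is queued.
    std::optional<T> pop() {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) {
            return std::nullopt;
        }
        T item = slots_[tail];
        // release: the slot read above completes before the slot is handed back.
        tail_.store((tail + 1) % Capacity, std::memory_order_release);
        return item;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};  // next slot to write
    std::atomic<std::size_t> tail_{0};  // next slot to read
};

// Single-threaded usage check; it does not exercise the concurrency at all.
void ring_demo() {
    SpscRing<int, 4> ring;
    assert(ring.push(1) && ring.push(2) && ring.push(3));
    assert(!ring.push(4));  // full: only Capacity - 1 slots are usable
    assert(ring.pop().value() == 1);
}
```

A single-threaded check like ring_demo() proves nothing about the concurrent behavior, which is precisely where the thread-analysis tools earn their keep.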

Touch Of Magic

Perhaps in this era of instant-gratification LLM code generating tools and “cut/paste from Stack Overflow”, we have forgotten the most important thing of all about programming. Namely that it is an engineering discipline that requires planning, documenting, testing, and feeling like the more you learn, the less you know, and the more easy mistakes you make as you write more complicated programs. Speaking as someone who is gradually porting personal C++ projects to Ada, what I have found along the way is that as much as I like C++, there’s something about Ada that really excites me.

Sure, it is a bit of a pain to get used to dealing with the default immutable string types, and it’s all too easy to just reach for the package of predefined integer and float types to get that instant gratification, not to mention the hours staring at the Ada compiler output as it informs you of all the issues in your code. Yet once you’re over those first hurdles, and the program just runs without glitches or oddities like you’re used to with C++ code, that’s an almost magical feeling.

This is perhaps why the ONCD report feels so wrong, as it contains none of the lessons of the past, nor the hard-won experiences of those who write the code that keeps much of society up and running. You can almost hear the cries of many senior software engineers as they wonder whether they’re merely chopped liver in the eyes of government organizations, even as said organizations are kept running due to countless lines of Ada, COBOL, C and C++ code. Never mind the security researchers who despair as basic input validation is once again ignored in favor of buzzwords pushed by a couple of large corporations.



from Blog – Hackaday https://ift.tt/WlwpCQc
