There is only so much you can share in a talk, and so I’ve decided to turn a short 50 minutes into a rightfully lengthy series. I know this post is long, but I kindly ask that you bear with me. We will revisit the topics discussed here repeatedly throughout the series, so it’s best to establish some basis and familiarity with them now.
Indeed, it makes little sense to jump into the technical meat and potatoes without first defining the words, processes, and concepts we will use to evaluate the work ahead of us. This post, after all, serves as our guide, establishing our goals and valid measures of whether or not we are successful.
In the compiled-binary world, reverse engineering is the act of taking an application (an executable or DLL) and leveraging a combination of compiler theory and assembly knowledge to arrive at a reasonable representation of its original source code. While this definition is accurate, it’s somewhat mechanical and not exactly revealing. We are going to define reverse engineering as:
The art of deducing an application’s elements, composition, behaviors, and relationships.
This definition is more functional, as it establishes the goals of our process. Why someone might reverse engineer an application varies with intent, and is to a degree irrelevant to the conceptual goals. That is, people reverse engineer for a variety of reasons, including interoperability, general education, and security testing. Although each of these reasons dictates unique attentions and focuses, our conceptual goals still stand. In our case, “why” we reverse engineer applications is predicated on the belief that security is a visibility problem.
In the ideal world, every engagement would grant me source code access and a copy of the application/environment*. Having 100% visibility into the static and dynamic environment of an application is incredibly powerful. By its nature, it eliminates the need for guessing and makes attacks significantly more informed and reliable. Simply put, a better job can be done because this is a position of advantage. It stands to reason, then, that in all situations less than the ideal, we must reverse engineer to get into that position.
The Process of Information Gathering
Now, if you’ve been around the block, you might note that few (if any) in the appsec industry use this lingo. In its stead, you will hear about information gathering and, in some cases, analysis. The Web Application Hacker’s Handbook (WAHH) uses this combined definition as the entry point to any web security test. While I believe the track they are on is correct in a sense, I’d dare suggest the picture painted is inaccurate.
Traditional information gathering, as defined by OWASP, WAHH, and many others, is ubiquitously listed as the first step in the hierarchy of checklist-style web testing. The laundry list of tasks it outlines includes:
- mapping visible content
- identifying non-visible content (forced browsing, search engine discovery)
- testing for debug parameters
- identifying data entry points
- identifying the technologies used in the application
- mapping the actual attack surface
- analyzing exceptions
These tasks are further broken down into numerous sub-tasks and subtle implications, such as testing with various browsers, extracting a list of every parameter used in the site, gathering comments, and so on. Though perhaps not apparent at first, these tasks create a huge amount of upfront work if you were to follow them to the letter.
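To make concrete what a literal reading of the “extract every parameter” task looks like, here is a minimal sketch using only the Python standard library. The sample page and every name in it are hypothetical; a real run would crawl every page of the site:

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs

class ParamCollector(HTMLParser):
    """Collects every parameter name visible in a page: form input
    names plus query-string keys found on links."""
    def __init__(self):
        super().__init__()
        self.params = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and "name" in attrs:
            self.params.add(attrs["name"])
        elif tag == "a" and "href" in attrs:
            query = urlparse(attrs["href"]).query
            self.params.update(parse_qs(query, keep_blank_values=True))

# Hypothetical sample page standing in for a crawled site.
page = """
<form action="/login"><input name="user"><input name="pass"></form>
<a href="/search?q=&page=1">search</a>
"""
collector = ParamCollector()
collector.feed(page)
print(sorted(collector.params))  # ['page', 'pass', 'q', 'user']
```

Even a full-site version of this yields only a list of names; it says nothing about what any parameter does or which ones matter.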
Which is why basically no one tests like that.
Let’s presume for a moment that you had the time to do all this. This is not a simple presumption, mind you, as the time this takes grows exponentially with the size of the application. But let’s say you did. Measured against the tasks and goals we defined for reverse engineering, what have you learned? You’ve collected a series of facts about the application, but you are realistically in no better a situation than when you started. Gathering a list of every parameter in a site doesn’t make you better situated to test any of them in a relevant way. You still lack context and understanding (more so if you used automation to achieve this).
And did I mention it’s slow? As the surface area of an application grows, especially its dynamic surface area, so does the amount of information you can possibly collect. Take just one aspect, the discovery of non-visible content, for instance. While there are only a few approaches, most commonly this is performed with tools such as DirBuster and Burp’s discovery tools**. These tools throw mutations and variants at the site based on content previously learned about. This approach sounds good, but the work grows very quickly and in most cases never actually finishes on anything except the most trivial of sites.
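The growth is easy to see in a sketch. The seed names and wordlists below are made up, but the recombination step mirrors what these discovery tools do: each learned name is crossed with prefixes, suffixes, and extensions, so the candidate queue multiplies:

```python
from itertools import product

# Hypothetical seed data: names learned from already-discovered content.
learned = ["admin", "login", "backup", "report"]
prefixes = ["", "old_", "test_", "dev_"]
suffixes = ["", "_v2", "_bak", "2024"]
extensions = ["", ".php", ".asp", ".bak", ".zip", ".tar.gz"]

# Every combination becomes a request the tool must send.
candidates = {p + name + s + e
              for p, name, s, e in product(prefixes, learned, suffixes, extensions)}
print(len(candidates))  # 4 * 4 * 4 * 6 = 384 requests from only 4 seed names
```

And every hit feeds back into the learned list, multiplying the queue again, which is why these runs rarely finish on anything but trivial sites.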
So. It is fair to say that some types of information collected (and some collection methods) are far more valuable than others. In most actual instances of reverse engineering (not for web), it’s rare one would try to collect “everything” about an application. More common would be to understand and evaluate a specific item of concern or to unpack a behavior. The task of information gathering and analysis (in our case reversing) is only valuable in its ability to drive us forward towards a goal. We do not collect information for information’s sake.
Instead, when asked to describe their methodology, the most common answer I hear*** is somewhat nebulous and uncomfortable. It’s often described as this:
I use the application/system like a normal user would, and follow leads and play with interesting things as they come up. I keep going until I feel as though I’ve hit all the important stuff.
Yikes. Admittedly, that’s not a very well-defined process, and surely not something that can be taught to others in its current state. Luckily, however, there are indeed ways to unpack the gems hidden in that statement.
The Art of Reverse Engineering Web Applications
The first thing to note about the aforementioned description is that the process is best understood as iterative, not hierarchical. The application is revisited over and over, and as new information is discovered it is absorbed into our understanding and acted against when it’s determined to be valuable. Deciding what to test is both natural and dynamic. In contrast, hierarchical testing implies you gather a bunch of details once and move on. Waterfall, as a development methodology, has fallen out of favor for building software, so why would we test that way?
For me, understanding this iterative process hit home when I studied John Boyd. Boyd was a modern military strategist in the Air Force who is perhaps best known for his work on maneuver warfare and the OODA (Observe, Orient, Decide, Act) loop. The OODA loop provides interesting insight into our natural method of processing information and deciding/acting on it. It proposes that we take in information and evaluate it for its worth based on our history, emotion, bias, and cultural experiences. Once evaluated, we decide what to do with it and act on it. This loop is constant, generally subconscious, and usually very quick (some loops are longer than others). You may not consciously choose to operate this way, but I believe you do regardless.
The implication in this type of testing (and further hinted at in our description) is that the tester must rely heavily on their ability to see patterns and deviations from those patterns. This places a premium on having exposure to a wide variety of patterns and practices, such that they can be observed and oriented to appropriately. While it’s theoretically possible that the first time you test an MVC-pattern site you’d discern its inner workings and details, it is unlikely. It is about as likely as writing a masterfully composed song the first time you pick up a guitar. Possible, but not likely. As such, we will spend considerable time discussing patterns for web applications in later posts.
Finally, the aforementioned process definition forces us to face the most common rebuke I get when sharing this approach: “how do you know when you are done?” It is usually coupled with an expressed desire to be thorough and to ensure the client gets the best test. This question is a good one, and not one asked enough.
To answer it, we have to strip away the illusion that any model could perfectly satiate the fear of being incomplete; they are all incomplete. We have no evidence to suggest that we are capable of finding and squashing all bugs (let alone security bugs) even when an application is put under numerous spotlights. The hierarchical model, in my opinion, exists exactly because of this fear: people like bookends. It fulfills a desire for a concrete beginning and ending to the test, but in exchange it steals away creativity and the relationship between the tester and what an application has to say. It’s like two people dancing next to each other, not with each other.
Instead, testing in a fashion reliant on past experiences, asking questions, and listening to the application is perhaps the best way to provide a thorough test. It allows the tester to deal with what IS going on with a site, rather than trying to fit the site into a specific mold.
To be clear, though, the nebulous definition of methodology still sucks. I am not suggesting testing an application in a totally undirected fashion. I am merely pointing out that actual conversation with the application has a much greater potential to drive us deeper into the heart of what is going on.
Since the aforementioned definition is indeed nebulous, the approach we will review and work with sits somewhere in between. It is clearly less formal than a hierarchical approach, yet more formal than “I just test the app.” It is a focused and iterative process in which each piece of the test drives us forward and continues to reveal more of the puzzle. It is both active and passive****, as in many cases we can shortcut the guesswork through functional exploits to gain deep visibility into the application’s composition.
Oh. So, how do I know when a test is over? When I say it’s over. Being a professional, reliant on my past experiences and education, puts me in a position to say that. Relying on someone else’s checklist does not. The rest of the series will revolve around unpacking what this all means. This process is neither comfortable nor traditional, nor is it yet complete. But for me, it’s made all the difference so far.
On a personal note, I’ve tried the hierarchical approach to testing applications through studying and following a wide variety of methodologies. In each case, I can say that inevitably I am left with a feeling of, quite frankly, boredom. Every test becomes the same, and the job becomes monotonous. I have embraced the approach I am outlining in this series because I’ve found that testing is a relationship. Applications are very honest, and if you can learn to ask intelligent questions, and how to listen to what they say, they will tell you a great deal about themselves.
* I also want access to the server it’s hosted on, or a recreation of the environment with total visibility.
** This is not a knock at these tools or their creators– only pointing out the shotgun approach bears limited fruit, especially compared to other more informed approaches.
*** My intent was not exactly scientific here, as I did not send out a formal survey or the like. I did, however, talk to a fair number of experienced and notable testers about this issue.
**** Active and passive testing are also very weak terms. Lately I’ve been using the terms elicitation and interrogation. I will get into more detail on that later.