In advance of the D-Day landings, the Allies needed ground-level pictures of the Normandy beaches without giving away the planned landing site to the Nazis. In addition to reconnaissance images, planners enlisted the BBC and held a vacation photo contest, which netted 10 million photos! Those unsuspecting civilians provided images that helped create a detailed look at the terrain the invaders would face as they stormed ashore.
Imagine the challenges faced as organizers sifted through those millions of photos.
- How many people were needed to sort through those photos? How long did it take?
- How do you begin to accurately sort content submitted from so many sources in various formats?
- Could they verify a photo labeled Calais wasn’t really Cherbourg?
Machine Learning to the Rescue

Thankfully, we have modern computing technology. Applying machine learning to government missions accelerates existing tasks and makes previously unachievable tasks feasible. For example, a model can be trained to identify people who appear on security cameras and alert personnel to the presence of flagged individuals at secure facilities, dramatically improving security at relatively little cost. Or an algorithm can automatically summarize sections of a document, reducing the time spent reading through pages of briefs in search of minute details. In each of these examples, machine learning accelerates and augments the work of existing personnel, alleviating the need to hire additional employees to meet time-constrained objectives.
Messy Data

While machine learning provides many improvements, we need to be aware of the limitations of artificial intelligence (AI) and other predictive solutions. High-quality machine learning solutions depend heavily on high-quality data. While the amount of data available has increased, steps must be taken to ensure that data is readily usable for analysis. Structuring data as it is created and stored greatly reduces development costs: frequently, the most time-consuming part of a data science project is “cleaning” (organizing) messy data. Imagine sorting those Normandy photos if none were labeled!
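To make the cleaning step concrete, here is a minimal sketch of normalizing inconsistent labels in hypothetical photo metadata. The field names, place names, and cleaning rules are illustrative assumptions, not real submission data or any specific production pipeline.

```python
# A minimal sketch of "cleaning" messy records before analysis,
# using hypothetical photo metadata with inconsistent labels.
import re

# Hypothetical raw submissions: stray whitespace, mixed casing, missing values.
raw_photos = [
    {"location": "  Calais ", "date": "1943-05-01"},
    {"location": "calais", "date": "05/01/1943"},
    {"location": "CHERBOURG", "date": "1943-06-12"},
    {"location": None, "date": "1943-06-12"},
]

def clean_location(value):
    """Normalize a free-text location label, or flag it as unknown."""
    if not value or not value.strip():
        return "UNKNOWN"
    # Collapse internal whitespace, trim, and standardize capitalization.
    return re.sub(r"\s+", " ", value).strip().title()

cleaned = [{**p, "location": clean_location(p["location"])} for p in raw_photos]
# Labels are now comparable, so records can be grouped and cross-checked.
```

Only after a pass like this can the two “Calais” records be grouped together and the unlabeled photo routed to a human for review.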
The Computer Said So

Another consideration is the common tradeoff between the accuracy of a machine learning model and the interpretability of the process it uses to produce results. Many of the models that provide the “best” results offer no practical insight into how they reached their conclusions. Decision makers cannot rely on the conclusions of “black box” models when weighing actions with serious national security consequences. In those cases, selecting a solution with slightly lower accuracy that maintains interpretability allows human experts to analyze and validate results and present actionable conclusions. We can’t use “the computer said so” as a viable justification to storm the beaches…we want to know the data sets used, their trustworthiness, and any assumptions made along the way.
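The contrast can be sketched in a few lines. The feature names, thresholds, and rules below are illustrative assumptions, not any specific fielded system: the point is that the interpretable version returns the reasons behind its answer, so an analyst can validate each rule before acting.

```python
# A minimal sketch contrasting a "black box" prediction with an
# interpretable one. All names and thresholds are illustrative assumptions.

def black_box_predict(features):
    """Returns only a label -- no insight into why."""
    score = 0.9 * features["match_score"] + 0.1 * features["badge_ok"]
    return "flag" if score > 0.5 else "clear"

def interpretable_predict(features):
    """Returns a label plus the human-readable rules that produced it."""
    reasons = []
    if features["match_score"] > 0.8:
        reasons.append("face match above 0.8 threshold")
    if not features["badge_ok"]:
        reasons.append("badge check failed")
    label = "flag" if reasons else "clear"
    return label, reasons

label, reasons = interpretable_predict({"match_score": 0.92, "badge_ok": False})
# An analyst can inspect `reasons` and decide whether the "flag" is justified.
```

In practice the interpretable model might be a decision tree or a rule-based system rather than hand-written conditions, but the property that matters is the same: the output can be traced back to checkable evidence.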
Gaining a Competitive Edge

At Engility, we approach data science solutions with the mission as the central focus. Keeping in mind the requirements of each customer’s unique problem set, we build and tailor machine learning solutions, iterating on possible approaches with a “fail fast” mentality to arrive quickly at the most suitable solution for the customer. Federal government organizations must keep pace technologically with the commercial sector to operate efficiently and maintain a leadership role worldwide. For this reason, we incorporate and innovate on cutting-edge data science techniques, tailoring best-in-class methods to the context of federal government missions. Continually developing new AI and machine learning methodologies to address issues of national security will help the U.S. maintain its competitive edge globally.
We have introduced these concepts into our MetaSift and Synthetic Analyst solutions. These tools can do in minutes what took our WWII counterparts months to complete.