The New Yorker has a fascinating article about advances in technology for un-redacting classified text in government documents.
Typically, classified material is redacted from disclosed documents with black bars that are technologically “burnt” into the document.
With the black bars in place, you are not supposed to be able to read what is behind them, because the material is sensitive.
But what if our adversaries have the technology to un-redact or un-burn those black bars, autocomplete the words behind them, and see what the text underneath actually says?
Our secrets would be exposed! Our sensitive assets put in jeopardy!
Already, a Columbia University professor is working on a Declassification Engine that uses machine learning and natural language processing to find semantic patterns that could make it possible “to predict content of redacted text” based on the words and context around it.
In this case, the declassified information in the document is used in aggregate to “piece together” or uncover the material that is blacked out.
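To make the idea of predicting a redacted word from its context concrete, here is a minimal sketch in Python. It is not the Declassification Engine's actual method, which the article does not detail. It is a toy stand-in that simply counts which words appear between the same two neighboring words in already-declassified text. The function name predict_redacted and the sample sentences are made up for illustration.

```python
from collections import Counter


def predict_redacted(corpus_sentences, left, right, top_n=3):
    """Rank words seen between `left` and `right` in declassified text.

    Hypothetical helper: a simple neighbor-count heuristic, standing in
    for the far richer statistical models a real system would use.
    """
    counts = Counter()
    for sentence in corpus_sentences:
        tokens = sentence.lower().split()
        # Look at every interior token and check both of its neighbors.
        for i in range(1, len(tokens) - 1):
            if tokens[i - 1] == left and tokens[i + 1] == right:
                counts[tokens[i]] += 1
    # Most frequent candidates first.
    return [word for word, _ in counts.most_common(top_n)]


# Invented sample text standing in for a declassified corpus.
corpus = [
    "the ambassador met the minister in cairo yesterday",
    "officials met the minister in cairo again",
    "the envoy arrived in paris yesterday",
    "talks resumed in cairo yesterday",
]

# Guess the blacked-out word in "... in [REDACTED] yesterday".
guesses = predict_redacted(corpus, "in", "yesterday")
```

With this sample corpus, the word seen most often in that slot is ranked first. A real system would presumably draw on far more documents and much richer context than two neighboring words.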
In an earlier case, in 2004, a doctoral candidate at Dublin City University used “document-analysis technologies” to uncover critical information related to 9/11.
This was also done by using syntax or structure and estimating the size of the blacked-out word. Automation then ran through dictionary words to see which ones would fit, and another “dictionary-reading program” filtered the result set down to the likely missing word(s).
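The size-and-dictionary approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the researcher's actual program: the per-character widths are invented (real work would measure them from the document's font), and the names word_width, candidates_by_width, and filter_by_context are hypothetical.

```python
# Invented character widths in arbitrary units (an assumption; a real
# analysis would measure these from the font used in the document).
NARROW = set("iljtf")
WIDE = set("mw")


def char_width(ch):
    if ch in NARROW:
        return 4
    if ch in WIDE:
        return 9
    return 6


def word_width(word):
    """Estimated printed width of a word under the assumed metrics."""
    return sum(char_width(ch) for ch in word.lower())


def candidates_by_width(dictionary, bar_width, tolerance=1):
    """Step 1: keep dictionary words that would fit under the black bar."""
    return [w for w in dictionary if abs(word_width(w) - bar_width) <= tolerance]


def filter_by_context(candidates, plausible):
    """Step 2: keep only candidates that make sense in context.

    Here 'context' is just a hand-picked set of plausible words.
    """
    return [w for w in candidates if w in plausible]


dictionary = ["egypt", "italy", "milan", "japan", "syria"]
fits = candidates_by_width(dictionary, bar_width=28, tolerance=0)
likely = filter_by_context(fits, plausible={"egypt", "syria", "italy"})
```

Measuring the bar narrows the dictionary to words of the right printed width, and the second pass discards the ones that do not make sense in the sentence.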
The point here is that, with the right technology, redacted text can be un-redacted.
Will our adversaries (or even allies) soon be able to do this? Or perhaps someone out there has already cracked this nut, and our secrets are already revealed?
(Source Photo: here with attribution to Newspaper Club)