bulk_extractor 0.3.2 released

May 25, 2010

bulk_extractor 0.3.2 is released. This version has several new
features based on user feedback, and a few bug fixes based on a
thorough code review.

New Features:

  • url_services.txt – a histogram of all URLs by domain.
  • url_searches.txt – a histogram of all search terms, taken from Google, Yahoo, and Bing URLs and from any other search service with “search” in the domain and “q=” or “p=” in the URL.
  • ccn.txt – this file now reports Federal Express account numbers, SSNs (if properly formatted or prefixed), DOBs, and other info.
  • tcp.txt – this experimental feature looks for IP and TCP packets in PAGEFILE.SYS, memory dumps, and hibernation files, and stores the results.
  • The whitelist and redlist files may now contain globbed terms. For example, put *@company.com in the redlist and any mention of anyone@company.com will be flagged and also written to a special file called redlist_found.txt.
  • Context: ccn.txt now shows the context from which the matched
    information was taken. hosts.txt shows context for numeric IP addresses.
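
To make the url_searches.txt and redlist rules above a little more concrete, here is a rough Python sketch of the two ideas. It is not bulk_extractor's own code. The function names, the KNOWN_ENGINES list, and the choice to match globs case-insensitively are all assumptions made for illustration.

```python
from collections import Counter
from fnmatch import fnmatchcase
from urllib.parse import urlsplit, parse_qs

# Assumption: hostname fragments standing in for "Google, Yahoo, Bing".
KNOWN_ENGINES = ("google.", "yahoo.", "bing.")


def search_term_histogram(urls):
    """Count search terms found in q= or p= parameters of search URLs.

    A URL qualifies if its host looks like a known engine or contains
    the word "search" (illustrative rule, not the tool's actual logic).
    """
    counts = Counter()
    for url in urls:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
        is_search_host = "search" in host or any(e in host for e in KNOWN_ENGINES)
        if not is_search_host:
            continue
        query = parse_qs(parts.query)  # decodes "+" and %xx escapes
        for key in ("q", "p"):
            for term in query.get(key, []):
                counts[term] += 1
    return counts


def redlist_hits(features, patterns):
    """Return the features matching any glob pattern.

    Assumption: matching is case-insensitive, so both sides are lowercased.
    """
    lowered = [p.lower() for p in patterns]
    return [f for f in features
            if any(fnmatchcase(f.lower(), p) for p in lowered)]
```

With this sketch, calling redlist_hits with the pattern *@company.com keeps alice@company.com and drops bob@other.org, which mirrors the redlist example above.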

Bug Fixes:

  • Improved handling of raw devices and files.
  • bulk_extractor is now less likely to error on some input data sets.
  • A crashing bug that impacted bulk_extractor 0.3.1 has been addressed.
