Large number of files - how to find the needle in the haystack

Hi all, just asking for any suggestions any of you may have on the best way to find information of value when you are faced with a large number of files.

I am working on one of the boxes and have managed to locate a 75MB backup of a system. I’ll be damned if I am going to search through every file looking for a plaintext username and password.

Thanks

I don't think there is a one-size-fits-all answer, as everything depends on the situation and the other information you can gather during enumeration.

For example, if you think the service runs MySQL, you can use find and/or grep to hunt for things of specific interest. It's not perfect, but being able to narrow your area of interest down to a single file extension certainly helps.
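As a rough sketch (the extension and the pattern are just guesses at what the backup might hold), something like this lists the PHP files that open a database connection:

grep -rlIiE --include="*.php" "mysqli?_connect" . 2>/dev/null

Swap the --include glob and the pattern for whatever your enumeration suggests is on the box.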

If all else fails, you’d be surprised at how fast find and grep can be.

For example, grep -ri password . will carve through a lot of data really, really quickly (just be prepared for a HUGE number of false positives).

As an alternative, you can use find and other tools to narrow the selection down, just be prepared to be methodical. One example is looking for config files - they often contain credentials - and find -iname becomes useful.
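Just as a sketch (adjust the names to whatever you expect on the target), something like this pulls out the usual config file suspects without you having to walk the tree yourself:

find . -type f \( -iname "*.conf" -o -iname "*config*" -o -iname "*.ini" \) 2>/dev/null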

It is definitely a good skill to practice, as during pentests/exploitation of live systems you are likely to be faced with 250+GB of data you need to parse to find useful loot.

@TazWake said:

I don't think there is a one-size-fits-all answer…
If all else fails, you’d be surprised at how fast find and grep can be.

As an alternative, you can use find and other tools to narrow the selection down, just be prepared to be methodical.

Completely agree with what TazWake says: use your enumeration to narrow and target. The last thing you want is to spend hours waiting for results only to find out you missed the directory you were looking for. At the same time, it sucks waiting for hours to search everything… it just sucks less.
Bottom line: enumerate and narrow down.

Here are two commands I have put together over time. They may give you a head start:

find / -iname "*.ovpn" -o -iname ".ssh" -o -iname "app.js" -o -iname "id_rsa*" -o -iname "db_connect" -o -iname "db_config.php" -o -iname "database_settings*" -o -iname "amportal.conf" -o -iname "tomcat-users.xml" -o -iname "config.php" -o -iname "db.php" -o -iname ".config"  -o -iname "configuration.php" -o -iname ".secret" -o -iname ".passwd" -o -iname "wp-config.php" 2> /dev/null

find is pretty fast, but it pretty much relies on standard file names and locations to have any luck.

grep -RiIE --exclude-dir "usr/share/doc" --exclude-dir "/<another_directory_to_exclude>" "(BEGIN\ RSA\ PRIVATE\ KEY|auth-user-pass|\[users\]|user=|AMPDBUSER=|AMPMGRUSER=|\$DBUSER|\$DBPASS|dbPassword|db_passwd|DB_USER|DB_USERNAME|DB_PASSWORD|dbUserName|mongodb://.*:.*@|mysql_connect|mysqli_connect|</User>|</Pass>|\$password\ =|\$user\ =)" / 2> /dev/null

No matter how fast grep is, it can take forever to finish. Make use of the ‘<another_directory_to_exclude>’ part, and maybe change the “/ 2> /dev/null” at the end to “/var 2> /dev/null” or whatever directory you suspect has the highest chance of containing anything. Searching ‘/proc’, for example, sounds like a bad idea to me… well, as far as my knowledge goes.
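To illustrate what I mean, a narrowed-down version could look something like this (the directory and the patterns are only placeholders, pick whatever fits your target):

grep -RiIE --exclude-dir "doc" "DB_USER|DB_PASSWORD|password\s*=" /var/www 2> /dev/null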

Hope this helps.
Best of luck, man.

P.S. I thought we could use markdown in our posts, but it doesn’t seem to work… I normally don’t have smiley faces in my commands…

Good advice from @gnothiseauton there, and @NeoCortex2000, if you are on the box I think you are on, there is a command in there which will absolutely work for you.

It’s also worth looking into a bit more “advanced” command line use - for example, you can use -exec with find to have it execute secondary commands such as cat if you want to check the contents of files you’ve found by name alone (for example, you could build something which finds all files named config and then searches their contents for a specific string you want to find).
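As a sketch of what that chaining looks like (the names “config” and “password” are just stand-ins for whatever you are actually hunting):

find / -type f -iname "*config*" -exec grep -Hi "password" {} \; 2>/dev/null

That finds everything with config in the name, then greps each hit for the string and prints the file name next to every match.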

It can be daunting when faced with a lot of data but realistically the command line tools parse this super fast.

If you had to manually traverse every folder, open each file and read the contents, it would kill you. But you don’t.

@TazWake pretty sure I didn’t do the box he’s on yet. A bit of luck goes a long way I guess…

True what you say about chaining and the -exec.
Lately I tend to move towards building readable lists and then processing from there (something like the sketch at the end of this post). It’s visually a lot easier to grasp and review what I did and what I may have missed. Anyway, no matter where you start from, chaining commands is definitely the way to go. Human eyes are just not built for processing that much data, if you ask me.
I sometimes read right past a password even when I know it’s there, let alone when we are talking megs of data…
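What I mean is something along these lines (the file patterns and the search string are just examples):

find / -type f \( -iname "*.conf" -o -iname "*.php" \) 2>/dev/null > interesting_files.txt

Then review and trim interesting_files.txt by hand, and run the content search over what is left:

xargs -d '\n' grep -iIH "password" < interesting_files.txt 2>/dev/null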

@gnothiseauton said:

@TazWake pretty sure I didn’t do the box he’s on yet. A bit of luck goes a long way I guess…

Rather than luck, I think this just shows your enumeration technique works.