What is a tool to get all pages for a site from Google?

So: pages which are currently indexed, i.e. not found with a spider and not with DirBuster.

TIA

Comments

  • I believe what you're referring to is "Google dorking", where you use search operators such as "filetype:pdf" to filter Google results? You might have to go into more detail if you want a quick and easy answer.

    Feel free to PM me, but please ask good questions: https://www.shorturl.at/fmAX6

  • edited September 2020
    What about sitemap.xml ?
But it mostly works for content/blogging sites only.

    A Chemist doing Penetration Testing - Check the Story here: BinaryBiceps

  • @gunroot said:

    What about sitemap.xml ?
But it mostly works for content/blogging sites only.

    Maybe he's talking about robots.txt?

    Feel free to PM me, but please ask good questions: https://www.shorturl.at/fmAX6

  • @PapyrusTheGuru
    > Maybe he's talking about robots.txt?

Search engines use sitemap.xml and its content to index a site's pages. It may help in some cases to find all the pages of a site.

    A Chemist doing Penetration Testing - Check the Story here: BinaryBiceps
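For what it's worth, a sitemap like that is trivial to read with Python's standard library. A minimal sketch, assuming the file follows the standard sitemaps.org schema (the sample XML below is made up; a real file would be fetched from something like https://example.com/sitemap.xml):

```python
# Minimal sketch: extract page URLs from a sitemap.xml document.
import xml.etree.ElementTree as ET

# Standard sitemap namespace from the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def urls_from_sitemap(xml_text):
    """Return every <loc> URL listed in a standard sitemap."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc")]

# Made-up sample standing in for a downloaded sitemap.xml.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/post-1</loc></url>
</urlset>"""

print(urls_from_sitemap(sample))
```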

  • Yes, basically Google dorking. How could I save all the URLs to a file? E.g. if I search for site:mydomain.com

  • edited October 2020
I believe the term you are looking for is 'web scraping': the automated process of retrieving data from a website (in this case Google).
Just know that Google doesn't like you doing that: it's in fact against their policies... which is ironic for a service that makes a living web-scraping other sites, but there you go.

Anyway, that doesn't mean you can't crawl Google results; it just means that if you fire about 150 requests at a rate of about 2 per second, you will get temporarily banned from Google.

Their suggested way of doing it is to pay for their API service.
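That paid route is Google's Custom Search JSON API. A minimal sketch of building a request URL for it, using only the standard library; the key and cx values are placeholders you would get from the Google Cloud console, and no request is actually sent here:

```python
# Sketch: build a request URL for Google's Custom Search JSON API.
# API key and search-engine id (cx) are placeholders, not real values.
from urllib.parse import urlencode

def custom_search_url(query, api_key, cx, start=1):
    """Build the URL for one page of API results (10 per page)."""
    params = {"key": api_key, "cx": cx, "q": query, "start": start}
    return "https://www.googleapis.com/customsearch/v1?" + urlencode(params)

url = custom_search_url("site:mydomain.com", "YOUR_API_KEY", "YOUR_CX")
print(url)
```

You would then page through results by incrementing start (1, 11, 21, ...) and pull the links out of the JSON response.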

    On your original question and how to do it:
If you look for 'python web scraping', you'll get a bunch of good and easy-to-follow guides.
There are also a multitude of tools that can scrape sites, from automation programs like 'automation anywhere' to dedicated web-crawling software. Google around; it's a pretty common task with many solutions.

    What tools to use:
    It kinda depends on what you want to do with the results:
    If you want full control over the results (with the drawback of being a lot of work): go with python, or whatever other language suits you.
If you want to get up and running fast and just have a file containing all results, you're better off with ready-made software... just a faster path to that goal.
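For the Python route, here's a minimal stdlib-only sketch of the parsing half: a generic link extractor you could point at a saved results page. It deliberately parses local HTML rather than hitting google.com, since Google's live result markup changes often and rapid requests get you rate-limited, as mentioned above. The sample HTML is made up:

```python
# Sketch: generic link extractor using only the standard library.
# Not Google-specific -- feed it the HTML of a saved results page.
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Keep only absolute links from <a href="..."> tags.
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href.startswith("http"):
                self.links.append(href)

# Made-up stand-in for a saved search-results page.
page = '<a href="https://mydomain.com/page1">p1</a><a href="/internal">x</a>'

collector = LinkCollector()
collector.feed(page)
print(collector.links)

# Saving the URLs to a file, as asked in the thread:
# with open("urls.txt", "w") as f:
#     f.write("\n".join(collector.links))
```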

    Best of luck!