
OWASP Web Application Security Testing #1

Information Gathering

***This blog post is connected with one of my YouTube videos, where I explain OWASP and this OWASP testing list. I recommend watching it before you read this post.***

Here, we try to understand the deployed configuration of the server hosting the web application.

OTG-INFO-001 : Conduct Search Engine Discovery/ Reconnaissance for Information Leakage 

Search engine discovery has direct and indirect elements.

Direct methods involve searching the indexes and the associated content from caches, while indirect methods involve collecting sensitive design and configuration information by searching forums, newsgroups and tendering websites.

Search engines use crawlers. Once a crawler has finished crawling a page, the search engine indexes it based on its tags and associated attributes, such as <title>, <head> and so on. The search engine then looks up these indexes in order to give us the relevant results.
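As a rough illustration of that indexing step, here is a minimal sketch in Python (standard library only) that extracts the <title> and outgoing links from a fetched page, the way a crawler would before indexing. The HTML string and class name are made up for this example; real search engine pipelines are far more elaborate.

```python
from html.parser import HTMLParser

class TitleAndLinkExtractor(HTMLParser):
    """Collect the <title> text and href links, like a crawler's indexing pass."""
    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

html = '<html><head><title>Demo</title></head><body><a href="/about">About</a></body></html>'
p = TitleAndLinkExtractor()
p.feed(html)
print(p.title)   # the page title a search engine would index
print(p.links)   # the links a crawler would follow next
```

The extracted links are what let a crawler walk from page to page across the web; the title and other tags feed the index.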

Before going forward...

1. What are web crawlers?
A web crawler is a bot that downloads and indexes content from all over the internet. These bots are almost always operated by search engines. By applying a search algorithm to the data collected by the web crawlers, search engines can provide relevant links in response to user queries.

2. Why Crawlers?
To learn what every webpage on the web is about, so that the information requested by a user can be retrieved when it is needed.

3. What is robots.txt file?
Robots.txt is a text file that instructs web robots (typically search engine crawlers) how to crawl pages on the website. Basically, the robots.txt file indicates whether certain user agents (crawlers) may or may not crawl certain parts of a website. Crawl instructions are specified by "allowing" and "disallowing" the behaviour of those user agents. Note that compliance is voluntary: well-behaved crawlers honour the file, but it is not an access control mechanism.

Syntax >>>
User-agent: [user-agent name]
Disallow: [URL string not to be crawled]
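Putting that syntax together, a hypothetical robots.txt might look like this (the paths are made up for illustration):

```
User-agent: *
Disallow: /admin/
Disallow: /staging/

User-agent: Googlebot
Allow: /
```

Here every crawler is asked to stay out of /admin/ and /staging/, while Googlebot is explicitly allowed everywhere.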

4. How to see the robots.txt file?

Syntax >>> https://<domain_name>/robots.txt

Following is an example (screenshot of a real robots.txt file).

5. More about the robots.txt file...
The robots.txt file is part of the Robots Exclusion Protocol (REP). REP is a group of web standards that regulate how robots crawl the web, access and index content, and serve that content to users.
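REP rules can also be evaluated programmatically. As a small sketch, Python's standard-library urllib.robotparser applies the rules the way a well-behaved crawler would; the domain and paths below are hypothetical:

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt for illustration (the path is hypothetical)
robots_txt = """\
User-agent: *
Disallow: /admin/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# A well-behaved crawler asks before fetching each URL
print(rp.can_fetch("*", "https://example.com/admin/secret"))  # disallowed
print(rp.can_fetch("*", "https://example.com/index.html"))    # allowed
```

In practice you would point the parser at a live file with set_url() and read(); parsing a string here just keeps the example self-contained.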

More reading about robots.txt >>> Click Here

If the robots.txt file is not kept up to date during the lifetime of the website, and inline HTML meta tags instructing robots not to index content have not been used, then it is possible for indexes to contain web content the owners did not intend to include.

Website owners may use the previously mentioned robots.txt, HTML meta tags, authentication and tools provided by search engines to remove such content.

Test Objectives

Here we try to understand what sensitive design and configuration information about the application/system/organization is exposed both directly (on the organization's websites) and indirectly (on third-party websites).

What to test

We have to use search engines to search for:
• Network diagrams and configurations
• Archived posts and emails by administrators and other key staff
• Log-on procedures and username formats
• Usernames and passwords
• Error message content
• Development, test, UAT (User Acceptance Testing) and staging versions of the website
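Searches along the following lines can surface several of the items above. These are common, publicly documented Google operator patterns, not an official OWASP list, and example.com stands in for the target's domain:

```
site:example.com filetype:log
site:example.com intitle:"index of"
site:example.com inurl:admin
site:example.com ext:sql | ext:bak
"@example.com" filetype:xls
```

The site: operator scopes results to the target, while filetype:/ext:, intitle: and inurl: narrow them to the kinds of artifacts listed above.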

Search Engines 

Do not limit testing to just one search engine provider, as different providers may generate different results depending on when they crawled the content and on their own algorithms.

Consider the following search engines:
• Baidu
• binsearch.info
• Bing
• Duck Duck Go
• ixquick/Startpage
• Google
• Shodan
• PunkSpider

Duck Duck Go and ixquick/Startpage provide reduced information leakage about the tester.

PunkSpider is a web application vulnerability search engine. It is of little use for a penetration tester doing manual work. However, it can be a useful demonstration of how easily script kiddies can find vulnerabilities.

Google provides the advanced "cache:" search operator [2], but this is equivalent to clicking the "Cached" link next to each Google search result. Hence, using the advanced "site:" search operator and then clicking "Cached" is preferred.

Example :  site:owasp.org


This will retrieve all the pages indexed under the owasp.org domain.

To display the index.html of owasp.org as cached, the syntax is: cache:owasp.org

Google Hacking Database

This is a list of useful search queries for Google. The queries can be put into different categories:

• Footholds
• Files containing usernames
• Sensitive Directories
• Web Server Detection
• Vulnerable Files
• Vulnerable Servers
• Error Messages
• Files containing juicy info
• Files containing passwords
• Sensitive Online Shopping Info

What are the options / Remediation

Carefully consider the sensitivity of design and configuration information before it is posted online.
Periodically review the sensitivity of existing design and configuration information that is posted online.

Read about advanced google search >>> Click Here

Trust me, Google advanced search will help you get the best out of Google. It will be an additional advantage if you use Google daily.

***I will post OTG-INFO-002 next week. Keep in touch.***

Thank You!!!


-In order to have a clear idea about what we have learned in second year second semester subject; Web Security, I decided to make mind maps for each lecture. -I think it's better to share the knowledge we earn among others. -That's why I came up with an idea to create a blog about this module. -I'm putting every mind map I had drawn in both .png format and .pdf format.  -There can be issues, make sure to comment them.  -Then I would be able to make my mistakes and also the others who visit this blog. -Every mind map will be stored inside my google drive and I will make the shared hyperlink embedded to this blog. ➤➤➤To download above mind map in .png format (Click Here) ⮜ ⮜ ⮜ ➤➤➤To download above mind map in .pdf format (Click Here) ⮜ ⮜ ⮜ HTTP Headers is a huge topic and I did not want above map to be a messy one. So that I created following mind map for that.   ➤➤➤To download above mind map in .png format  (Click Here) ⮜ ⮜ ⮜ ➤➤➤To download above mind map in .pdf form...