Precision Tools for Cutting Through the Web's Clutter
Search like a surgeon: quickly, accurately, and reliably. Sharpen your search skills with expert techniques that dissect Google’s algorithm, ensuring you get only what you need.
A wise doctor once wrote that Google is not what it used to be. It used to be a decent search engine, but since approximately 2016 it mostly serves ads, curated content, or things Google wants you to see based on the immense amount of information it knows about you.
If you follow my posts you know that access to quality information is essential for informed decision-making in all aspects of your life. Nowadays, most valuable information is typically hidden behind a paywall, and no longer free. Surgeons need access to reliable, accurate, up-to-date information to inform their daily practice.
Today, I will teach you how to use Google like a pro. In the first part of the article, you will learn basic tools and techniques that will allow you to find what YOU want, not what the algorithms and advertisers want you to see.
In the second part, I will teach you how to “offensively” search on Google. However, I need to warn you:
Google knows who you are, what you do, and where you are. For this reason, and many others, I strongly advise you to use these techniques with good intentions (or with very good protections). Be warned that not only can you get in trouble with the law, but Google will also actively start to block your connection if you access the internet from a single static IP. Consider yourself warned.
Why still use Google?
Whether you are a doctor who needs life-saving information, a politician trying to understand your voters, a businessman trying to learn about competition, a hiring manager pre-vetting candidates, or a private detective - Google is simply hard to miss. While it is not what it used to be, it is still a powerful tool in capable hands.
Contrary to popular belief, Google is not free. You pay for using Google’s services with valuable information about yourself, your habits, your contacts, your family, your preferences, orientation, political views and over 10,000 other data points that Google will collect about you every time you use it.
Google is also not universally present all over the world. Notably China and Russia have their own alternative search engines:
Baidu - a leading Chinese search engine (>70% market share); shares a similar design to Google. It is heavily censored, especially with regard to political information. It requires you to make searches in Chinese. If you decide to try it - use translation services. Western names will likely be written in Chinese AND Latin alphabets.
Yandex is the most popular Russian search engine, likely indexing at least as many pages as Google. Occasionally you might even get better results than with Google, typically for pages older than a few years, which Yandex keeps but Google has often removed. Yandex allows searches in English (or French), but the best results are only available if you search in Russian. It is censored and monitors your searches just as Google does.
There are many other search engines available:
Bing (with much better mapping than Google, and with better video search; recently boosted with GPT-4, the model behind ChatGPT)
DuckDuckGo - essentially Google, but without tracking (not entirely, but much safer than Google). Like Google it sells ads (first 1-3 results), however since it does not target users in the way Google does, it often returns way more accurate results, closer to your search query. It works like Google without tracking — which is why many people are often surprised that the results they get are not “what they are used to”. Strongly recommended, as it does not save your searches. It is also essential for searching the “Dark Web” (if connected to it via TOR) because it can index .onion sites.
The basics
It is estimated that search engines (like Google) offer access to approximately 10% of total web content, where 90% belongs to the “deep web” (not the same as the Dark Web).
Google uses a ranking system called PageRank [1] to order search results from what it considers “most relevant” to “least relevant”. Relevance is determined by multiple factors, such as:
how many other sites link to a site of interest
how often other people click on a given result
how often certain words and phrases are repeated in the content of the website
how much money someone has paid for you to see the search results
and many others
What is immediately apparent is that to Google's algorithms relevant = popular. This means that your default search queries are most likely to find what is popular, not what is relevant, accurate, true or necessary. For example: if you search for Donald Trump, you are likely to see pages about a pathological liar, not a plumber from Northumberland (if leaking pipes is the reason you went on Google).
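If you are curious how "popular = relevant" can be turned into a number, here is a minimal sketch of the textbook PageRank idea in Python. It is a simplified illustration only, not Google's actual ranking code. The tiny "web" below (celebrity.example, plumber.example and so on), the damping value of 0.85 and the function name are all assumptions made up for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of scores.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline score...
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # ...plus an equal share of the score of each page linking to it.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: everybody links to the celebrity, nobody to the plumber.
toy_web = {
    "celebrity.example": ["news.example"],
    "news.example": ["celebrity.example"],
    "blog.example": ["celebrity.example", "news.example"],
    "plumber.example": ["celebrity.example"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{page}: {score:.3f}")
```

The page with the most incoming links ends up on top, and the plumber, whom nobody links to, ends up at the bottom, regardless of how useful his page would be to you.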
That’s why the very first thing you need to learn is Boolean operators. Sounds scary, but it is quite simple.
Quotation marks (“ ”)
Allow you to search for a specific phrase (the words must appear in the results in the exact order you provided).
Try searching for those two phrases and see which gives more specific results:
Very useful when searching for information about a specific person (such as an author of a study). Try: “first_name last_name” or “address post code city”. This was much more useful a few years ago, as today Google will provide similar (or the same) first 3-5 results. But for complex queries, it still works wonders.
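If you run the same kind of search often, you can let a few lines of Python wrap the phrase in quotes and build the link for you. This is only a sketch: the helper names are mine, and the base address https://www.google.com/search?q= is an assumption about how the search URL is usually formed.

```python
from urllib.parse import quote_plus


def exact_phrase(phrase):
    """Wrap a phrase in double quotes so the words must appear in this order."""
    return f'"{phrase}"'


def search_url(query, base="https://www.google.com/search?q="):
    """Encode the query so it can be pasted into a browser address bar."""
    return base + quote_plus(query)


query = exact_phrase("first_name last_name")
print(query)
print(search_url(query))
```

Swap the base parameter for another search engine's address if you prefer not to use Google.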
Cached pages
Google creates a copy of every page it comes across and saves (caches) it. You can try and access those copies at any time. Very useful when the original page is not available, or you are looking for information that has changed over time. Google displays by default the most recently indexed page (the latest version).
Another use of cached pages is to extract email addresses from PDF documents.
How to use it? Add “cache:” followed by the address of your link. For example:
cache:https://bbc.co.uk/
If you are a victim of Google’s marketing and are using Chrome (or a Chrome-based browser, such as Edge) you are NOT able to access cached pages. Google actively prevents you from doing it. Please make sure to use a decent browser, or find a Chrome plugin called “Web Cache Viewer”.
Cached pages are not available if the pages were not indexed in the past (or if the page owner requested the removal of cached content from Google indexing).
As of February 2024, Google has completely shut down the cache search feature, but you can still use other search engines or web archives to find the information you need. [2]
Punctuation and capitalisation
Google is unable to distinguish between upper and lower case words, words with punctuation, hyphens, at (@) sign, etc. This is annoying, as it makes searching for e-mail addresses harder. More on this later.
Restricting searches to a specific site
If you want to search only the content of one domain or server use the “site:” prefix:
site:bbc.co.uk Sunak
This will return all indexed pages on the BBC portal that have the surname of the British Prime Minister mentioned on them. This is very handy when searching large sites, government resources, scientific content, press archives, or if you want to find information on your competition.
OR operator (OR)
Using OR between words or expressions will bring you all the links with one or the other phrase (only one needs to be present in the results).
Very helpful when saving time, as you don’t have to run multiple searches. For example, the following query will help you search for e-mail addresses of people or companies in the Leeds area with a LinkedIn profile:
leeds gmail OR hotmail OR “bbc.co.uk” OR “btinternet.com” site:www.linkedin.com
Another good example is to use OR when you are not sure what the correct spelling of a name or place is. You can also use it to search for phone numbers in several formats with a single query:
“1234567890” OR “1234 56 78 90” OR “1234 567 890” OR …
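Typing every spacing variant by hand gets tedious, so here is a small Python sketch that generates them. The function name and the three grouping patterns are my own assumptions, chosen to reproduce the example above; add whichever groupings are common in the country you are searching.

```python
def phone_variants(digits, patterns=((10,), (4, 2, 2, 2), (4, 3, 3))):
    """Return the number in several spacings, each quoted and joined with OR."""
    variants = []
    for pattern in patterns:
        # Skip groupings that do not fit the length of this number.
        if sum(pattern) != len(digits):
            continue
        parts, position = [], 0
        for size in pattern:
            parts.append(digits[position:position + size])
            position += size
        variants.append('"' + " ".join(parts) + '"')
    return " OR ".join(variants)


print(phone_variants("1234567890"))
```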
AND operator (+) and NOT operator (-)
In this case, both words must be present. Google applies AND by default, but it still needs to be specified in other engines. Especially useful when searching for multiple phrases on the same site:
“Octopus Energy” + rates
Another good example is when searching for people with a certain profession in a certain geography on Facebook:
site:www.facebook.com director + Leeds
The NOT operator will have the effect of excluding certain results:
site:www.facebook.com director + Leeds -john
This will list the people with a certain profession in a certain geography, as long as they are not called John.
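The same pattern (a site, some required words, some excluded words) comes up again and again, so it can be sketched as a tiny Python helper. The function and parameter names are hypothetical, and it simply glues the operators together in the way shown in the examples above.

```python
def build_query(required, excluded=(), site=None):
    """Combine required terms (+), excluded terms (-) and an optional site: filter."""
    parts = []
    if site:
        parts.append(f"site:{site}")
    # Required terms are joined with the + sign used in the examples above.
    parts.append(" + ".join(required))
    # Each excluded term gets a leading minus sign.
    parts.extend(f"-{term}" for term in excluded)
    return " ".join(parts)


print(build_query(["director", "Leeds"], excluded=["john"], site="www.facebook.com"))
print(build_query(['"Octopus Energy"', "rates"]))
```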
Search for a specific keyword in the URL (“inurl:”)
The URL (Uniform Resource Locator) is the syntax that allows your browser to retrieve a resource published somewhere on the web. Think about it as the address for any content on the internet. URLs direct your browser to another computer (server) by pointing to a unique resource, and they are often “designed” in a way that helps humans understand where the resources are.
Therefore searching for keywords within the URL can help you find very specific content. For example:
inurl:CV
will allow you to search indexed pages that contain the word CV in the URL. You can combine it with other techniques described above to get some fairly interesting content. For example:
inurl:CV filetype:pdf "John Smith"
REMINDER: before you type the above into Google, make sure you understand that Google knows who you are, where you are and what you do online. You will likely get a “Captcha” request when you type the above search into the browser. This is your first warning, and Google will flag your search as suspicious. I strongly recommend using DuckDuckGo.com or another search engine that will not track you!
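When a query starts combining several operators, it helps to see how it breaks down into parts. The Python sketch below splits a query into its operators and its plain search terms. It is only my own illustration of the structure, not how any search engine actually parses queries, and it assumes the file type is written as a bare extension (filetype:pdf).

```python
import shlex

# Operators covered in this article; anything else is treated as a plain term.
KNOWN_OPERATORS = {"site", "inurl", "intext", "intitle", "filetype", "cache"}


def parse_query(query):
    """Split a query into an {operator: value} dict and a list of plain terms."""
    operators, terms = {}, []
    # shlex keeps a quoted phrase together as one token (and drops the quotes).
    for token in shlex.split(query):
        prefix, separator, value = token.partition(":")
        if separator and prefix in KNOWN_OPERATORS:
            operators[prefix] = value
        else:
            terms.append(token)
    return operators, terms


operators, terms = parse_query('inurl:CV filetype:pdf "John Smith"')
print(operators)
print(terms)
```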
Search for a specific text in the body of the webpage (“intext:” and “allintext:”)
You need to provide a single keyword (intext) or multiple keywords separated by a space (allintext) to display all pages that contain all the keywords.
Make sure your spelling is correct as the results will be very specific and narrower than a “normal” search.
Searching for something in the title (“intitle:”)
As above, but searches only page titles.
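To make the difference between body and title searches concrete, here is a toy Python model of a single page. The page content and function names are invented for illustration, and real search engines do far more than this simple word matching.

```python
import re


def words(text):
    """Lower-case the text and return the set of words in it."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def allintext(page, keywords):
    """True only if every keyword appears in the body of the page."""
    body_words = words(page["body"])
    return all(keyword.lower() in body_words for keyword in keywords)


def intitle(page, keyword):
    """True if the keyword appears in the title of the page."""
    return keyword.lower() in words(page["title"])


# Hypothetical page used only for this example.
page = {
    "title": "Curriculum vitae of John Smith",
    "body": "John Smith, consultant surgeon based in Leeds.",
}

print(allintext(page, ["surgeon", "Leeds"]))
print(allintext(page, ["surgeon", "London"]))
print(intitle(page, "vitae"))
print(intitle(page, "surgeon"))
```

One misspelled or missing keyword is enough for the page to drop out, which is why these operators give such narrow results.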
Broadening the search (“~”)
The tilde symbol (~) can be used to broaden search results, as it will include synonyms of the keyword marked with this symbol. For example:
~CV "John Smith"
This will result in plenty of garbage (not a very specific search). It might include many job postings, templates, CV writing guides, etc. You can, however, combine the tilde with the NOT operator:
~CV “John Smith” -example -job -template
The above will give you far more refined results.
Search for variations in the root words (*)
The asterisk (*) allows you to catch several variations of a root word or phrase in one search when you don’t know which form will appear in the results. Example:
motivat* - will find “motivate”, “motivates”, “motivation”, “motivational”, etc.,
child* - will find “children” as well as “child”,
*site - will return results such as “site”, and “website”.
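You can get a feel for what a wildcard pattern will and will not catch by trying it on a word list first. The Python sketch below uses the standard fnmatch module, whose wildcard matching is similar in spirit; the vocabulary is a made-up sample, and a search engine may treat the asterisk differently.

```python
from fnmatch import fnmatchcase


def expand(pattern, vocabulary):
    """Return the words from the vocabulary that match a * wildcard pattern."""
    return [word for word in vocabulary if fnmatchcase(word, pattern)]


# Hypothetical word list used only for this example.
vocabulary = [
    "motivate", "motivates", "motivation", "motivational", "motive",
    "child", "children", "site", "website", "sitemap",
]

print(expand("motivat*", vocabulary))
print(expand("child*", vocabulary))
print(expand("*site", vocabulary))
```

Note that "motive" is not matched by motivat*, and "sitemap" is not matched by *site, so choose where you cut the root word with care.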
Finding sites related to a particular URL (“related:”)
This operator is best used to explore how Google ranks your website, or the website of your competitor.
related:bbc.co.uk - will return results from other news sources, including social media, press, app stores, and others. Helps to understand how Google categorises and “sees” your website.
Searching for a price (“£”, “$”, etc.)
You can use currency operators to easily find prices of products or services. Works really well with US dollars, Euro, British Pounds, Japanese Yen, and Chinese Yuan, but this is fast growing to include other currency operators.
Searching for a range (“X..Y”)
To search within a specific range, for example a price range, place two dots (“..”) between two numbers to include all numbers from that range in the search query. Helpful with dates, times, and everything sequential:
Book £5..£15
This will find books that cost between £5 and £15. You can add additional operators to narrow your search to a specific store or topic as well (you already know how to do it).
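Here is a short Python sketch of the range idea. It assumes the two-dot form of the operator with the currency sign repeated on both numbers, and the function names are hypothetical. The second helper simply shows that both ends of the range are meant to be included.

```python
def price_range_query(item, low, high, currency="£"):
    """Build a range query such as: book £5..£15"""
    # Put the smaller number first if the arguments arrive in the wrong order.
    if low > high:
        low, high = high, low
    return f"{item} {currency}{low}..{currency}{high}"


def in_range(price, low, high):
    """A range covers both of its end points."""
    return low <= price <= high


print(price_range_query("book", 5, 15))
print([price for price in (4, 5, 9, 15, 16) if in_range(price, 5, 15)])
```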
Searching for words not more than X amount of words apart (“NEAR/X”)
Use when searching for 2 keywords to appear in the results no further than X words apart:
sunak NEAR/10 scandal
Will provide websites where the British Prime Minister, or his family members who share the same surname appear within 10 words from “scandal”.
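To make "no further than X words apart" concrete, here is a Python sketch that checks a piece of text the same way. The function name and the sample sentence are invented, and counting distance as the difference in word positions is my assumption about how the limit is measured.

```python
import re


def within_distance(text, first, second, max_words):
    """True if the two keywords occur no more than max_words positions apart."""
    tokens = re.findall(r"\w+", text.lower())
    first_positions = [i for i, token in enumerate(tokens) if token == first.lower()]
    second_positions = [i for i, token in enumerate(tokens) if token == second.lower()]
    return any(
        abs(i - j) <= max_words
        for i in first_positions
        for j in second_positions
    )


# Hypothetical sentence used only for this example.
sentence = "Sunak said on Tuesday that the scandal would be investigated"

print(within_distance(sentence, "sunak", "scandal", 10))
print(within_distance(sentence, "sunak", "scandal", 3))
```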
Searching for specific files (“filetype:”)
Allows you to find specific file types. For example, PowerPoint presentations:
filetype:ppt
Very useful when, for example, you want to search for people's e-mail addresses but only in Excel spreadsheets. Popular file type extensions include:
Spreadsheets: .xls, .xlsx, .ods, .numbers
Documents: .doc, .docx, .pages, .odt, .rtf, .txt, .xml, .pdf
Presentations: .ppt, .pptx, .odp, .key
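Since one kind of document usually comes in several extensions, you can cover all of them in one query by joining several filetype: filters with OR. The Python sketch below builds such a group. The helper name is mine, and it assumes extensions are written without the leading dot.

```python
# Spreadsheet extensions taken from the list above, without the leading dot.
SPREADSHEETS = ["xls", "xlsx", "ods", "numbers"]


def filetype_group(extensions):
    """Join several filetype: filters with OR and wrap them in parentheses."""
    return "(" + " OR ".join(f"filetype:{ext}" for ext in extensions) + ")"


print(filetype_group(SPREADSHEETS))
print(filetype_group(["xls", "xlsx"]) + ' "gmail.com"')
```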
Grouping search operators to help structure your query
For example, the query below will allow you to find specific sites.
Try the following (again - don’t be surprised if you get a “check” from Google; consider using DuckDuckGo)
allintitle:"ip camera" (moscow OR warsaw)
Even the first result will be interesting.
Try to play around with everything above and see how you can combine them to give you what you are looking for. Those are basic tools. Expect that you will fail a couple of times before you learn to reliably use those queries in everyday searches.
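Many of those early failures come from a forgotten quotation mark or bracket. The Python sketch below is a very rough checker you could run on a query before pasting it into a search box. It is my own illustration, it assumes phrases are written with plain double quotes, and it does not understand brackets placed inside quoted phrases.

```python
def check_query(query):
    """Return a list of obvious problems found in a grouped query."""
    problems = []
    # Every opening quotation mark needs a closing one.
    if query.count('"') % 2:
        problems.append("unbalanced quotation marks")
    depth = 0
    for character in query:
        if character == "(":
            depth += 1
        elif character == ")":
            depth -= 1
            if depth < 0:
                # A closing bracket appeared before any opening bracket.
                break
    if depth != 0:
        problems.append("unbalanced parentheses")
    return problems


print(check_query('allintitle:"ip camera" (moscow OR warsaw)'))
print(check_query('allintitle:"ip camera (moscow OR warsaw'))
```

An empty list means the quotes and brackets at least pair up; it says nothing about whether the query will find what you want.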
Part 2: More advanced Google search techniques
This part will show you some additional tips on how to effectively search for information in free and publicly available resources. This is still the “visible 10%”.
Because search engines have really good indexing automation, it is not uncommon that some sensitive information finds itself on the web. This means that you might be exposing too much information about yourself (most people don’t even realise they are doing so). But so can your competitors.
It is quite easy to find access to non-public information, usernames, passwords and vulnerabilities by ONLY using search engines, and without any technical knowledge.
I WILL WARN YOU AGAIN: Google knows who you are, where you are and what you do. Those types of searches WILL BE flagged and reported. This will attract unwanted attention. I strongly recommend using the knowledge for educational purposes only, and with good intentions. Or at least use very good protection (even a crypto-paid off-shore VPN and a well-run generic open-source operating system on a Virtual Machine with a spoofed MAC address might NOT be enough!).
For example, if you happen to learn how to search and you come across an open “index of” directory listing with sensitive data, because someone forgot to sufficiently protect it, you are not technically breaking any laws, but downloading and using the data would result in people in suits raiding your home.
You also need to understand that Google will start blocking your connection if your Internet Provider assigns you a static IP address. Consider backing off if you see a sudden request for “captcha”. This can happen very quickly. Don’t panic - it is just a warning.