23.05.2018
Semalt Expert: How To Extract Text From Web Pages
While there are scraping tools capable of extracting data from multiple pages in a matter of seconds, the
one sure way of extracting text from web pages has always been to highlight and copy it. This method is
cumbersome, however, especially when you have to copy text from multiple pages. In addition, some web
developers lock a page's content to prevent it from being copied.
To start off, there are several quick methods of extracting text from web pages. Depending on the amount of
text you want to obtain, you can choose between the following methods:
1. Save-page method
This technique relies on the ability of browsers to save a copy of the
current web page locally. To do so, press Ctrl+S, or right-click on the
page and select the save option from the popup menu. This opens a save
dialog that asks you to specify some attributes of the saved page.
On the lower section, there's a "filename" option that lets you specify the name of the web page file. Note that
the browser will also create a folder with a similar name that contains the page's attached data, such as
images and backgrounds.
http://rankexperience.com/articles/article2374.html 1/2
Below that, there is a "save as type" option that allows you to specify the file type to save as. Since we are
interested in text only, select the ".txt" type (where the browser offers it), which creates a text file
containing all of the web page's text that can be edited in any text editor or word processor. This method is
especially useful when you have to copy full pages. If you need to leave out some parts of the text, simply open
the text file and cut out the unnecessary text.
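If the browser only offers HTML in the "save as type" list, the text can still be pulled out of the saved file afterwards. The sketch below is one possible way to do that with Python's standard html.parser module; the names TextExtractor, html_to_text and saved_page_to_text, as well as any file paths passed to them, are illustrative assumptions and not part of the method described above.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text content, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside a skipped tag
        self.chunks = []       # text fragments in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html):
    """Return the text of an HTML string, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def saved_page_to_text(html_path, txt_path):
    """Read a saved page and write its text to txt_path (both paths are hypothetical)."""
    with open(html_path, encoding="utf-8", errors="replace") as source:
        html = source.read()
    with open(txt_path, "w", encoding="utf-8") as target:
        target.write(html_to_text(html))
```

Calling saved_page_to_text("page.html", "page.txt") would then produce a text file similar in spirit to the ".txt" save option, although line breaks and spacing will likely differ from what a browser writes.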
2. Ctrl+C and Ctrl+V method
This is probably the oldest trick in the book. Using only your mouse, you
highlight the text that you wish to extract, then copy it and paste it
elsewhere. This method is useful when you need to copy snippets and
quickly use them in another document.
To perform this, scroll to the part containing the text that you require, then press and hold the left mouse
button to switch the cursor from "navigation" mode to "highlight" mode. Keep holding the left mouse button and
move the cursor to highlight your text. When you are done, release the button and right-click on the text that
you have highlighted to pop up the context menu. On it, click the "copy" option (or press Ctrl+C) to copy the
selected text.
Navigate to the document where you want to save the text, right-click to pop up the menu, and click "paste"
(or press Ctrl+V).
Note that you can choose between various paste modes; if you are interested in text only, click "paste as
plain text".
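Even after a plain-text paste, copied text often carries leftover spacing from the page layout. As a rough illustration, the sketch below tidies such text with Python's standard re module. The function name to_plain_text and the specific clean-up rules are assumptions chosen for this example; it does not reproduce what any particular editor does when pasting as plain text.

```python
import re


def to_plain_text(pasted):
    """Tidy pasted text: uniform line endings, plain spaces, no extra blank lines."""
    text = pasted.replace("\r\n", "\n").replace("\r", "\n")  # uniform line endings
    text = text.replace("\u00a0", " ")        # non-breaking space -> ordinary space
    text = re.sub(r"[ \t]+", " ", text)       # collapse runs of spaces and tabs
    text = re.sub(r" ?\n ?", "\n", text)      # drop spaces around line breaks
    text = re.sub(r"\n{3,}", "\n\n", text)    # keep at most one blank line in a row
    return text.strip()
```

Paragraph breaks (a single blank line) are kept on purpose so that the pasted snippet stays readable in the target document.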