Anonymizer as a replacement for a proxy. Checking the validity of proxies

A script for building and checking a list of web proxies

I stumbled upon an interesting note titled "And a little more about the Google Hack", in which the author describes using web proxies (anonymizers, for example the site Anonymouse) instead of public proxies to bypass the Google captcha.
I liked this way of using proxies, so I decided to write a script that collects a list of public web proxies and checks their validity.

Advantages of an anonymizer over a "classic" public proxy

  • Anonymizers, unlike public proxies, rarely die and are almost always available online
  • Anonymizers usually provide higher speed than public proxies or Tor
  • An anonymizer not only hides your IP address but, depending on its settings, can also hide cookies, the user agent, and other "tails"
  • It is easier to "teach" your program to work through a web proxy: it is enough to pass the encoded URL string to the anonymizer's interface
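
The last point can be sketched in Perl roughly as follows. This is a minimal illustration, not code from the script: the helper name proxied_url and the host proxy.example are made up, and it assumes the proxy reads a Base64-encoded address from a query parameter.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

# Hypothetical helper (not from the original script): build the URL that
# asks a web proxy to fetch $target on our behalf.
#   $proxy_url - the proxy entry point, e.g. ".../index.php"
#   $param     - the query parameter the proxy engine reads (assumed)
sub proxied_url {
    my ($proxy_url, $param, $target) = @_;

    # Passing '' as the second argument stops encode_base64 from
    # appending a newline to the result.
    my $encoded = encode_base64($target, '');

    # Base64 output may contain "+", "/" and "=", so percent-encode
    # anything that is not safe inside a query string.
    $encoded =~ s/([^A-Za-z0-9_.~-])/sprintf('%%%02X', ord($1))/ge;

    return $proxy_url . '?' . $param . '=' . $encoded;
}

print proxied_url('http://proxy.example/index.php', 'q', 'http://www.google.com'), "\n";
```

Whether a particular anonymizer expects exactly this encoding depends on its engine and settings, so treat the parameter name and the encoding as things to verify per proxy.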

What is an anonymizer (web proxy) useful for, and what else could it be useful for?

  • together with a search engine parser, to bypass the captcha that the search engine shows when it receives a large number of requests from one address
  • by sending requests to the desired site through different proxies, you can inflate its visit counter (this hypothesis needs verification)
  • in scripts that post to forums or post comments on sites
  • in any other parsing where there is a risk of a ban (for example, when parsing the directory of the site nakolesah.ru, which I mentioned earlier)

Collecting a list of public web proxies

We entrust building and checking the proxy list to a Perl script. Some fragments of it are given below, and the full text is, as usual, available for download in the "Soft" section (updates will appear there as well).

To run the script in list-building mode, pass google or ajax as the value of the -i option:
anocheck.pl -i google
Explanation of the values:

  • google - public web proxies are found by parsing Google web search results. The list in this case is fairly large, but there is a chance of getting a captcha or a temporary ban
  • ajax - the proxy list is obtained from a query to the Google search API. The output is only 8 results, but there is no captcha.

It seems to me the best way to use the script is to compile the initial proxy list with the google option and then pass the resulting list file in for checking.

To search for proxies running on the PHProxy and Glype engines, the following search queries are used:

# 1 - on the PHProxy engine
my $phproxy_sreq = '"Rotate13" "Base64" "Strip" inurl:index.php?q=';
# 2 - on the Glype engine
my $glype_sreq = '"Encode URL" "Allow Cookies" "Remove Scripts" inurl:browse.php?u=';
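
As a rough illustration of how such a query string might be turned into a request address, here is a small sketch. The helper names uri_escape_simple and search_url are hypothetical, and the URL layout (a single q parameter on https://www.google.com/search) is an assumption rather than what the script necessarily does.

```perl
use strict;
use warnings;

# Percent-encode a string for use as a query-string value.
# ASCII input is assumed, which holds for the queries used here.
sub uri_escape_simple {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_.~-])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Assumed search URL layout: https://www.google.com/search?q=<query>
sub search_url {
    my ($query) = @_;
    return 'https://www.google.com/search?q=' . uri_escape_simple($query);
}

my $phproxy_sreq = '"Rotate13" "Base64" "Strip" inurl:index.php?q=';
print search_url($phproxy_sreq), "\n";
```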

The Google results are then parsed, and the web proxy addresses found are added to the list:

# 1 - look for proxies running on PHProxy
while ($source =~ m#<h3 class="r"><a href="(https?://w{0,3}\.?[\w-]+\.[a-z]{2,4}[/\w-]*/index\.php)\?q#ig) {
    # the URL is used as a hash key, so duplicates collapse into one entry
    $proxy_list->{$1}++;
}
# 2 - look for proxies running on Glype
while ($source =~ m#<h3 class="r"><a href="(https?://w{0,3}\.?[\w-]+\.[a-z]{2,4}[/\w-]*/browse\.php)\?u#ig) {
    $proxy_list->{$1}++;
}
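
To see what these two loops capture, they can be tried on a small hand-made page. The sketch below folds both patterns into one hypothetical function, collect_proxies; the sample markup and the example.com/org/net hosts are invented for the demonstration and only imitate the shape the regular expression expects.

```perl
use strict;
use warnings;

# Script name => query parameter for the two engines searched for above.
my %ENGINES = ('index.php' => 'q', 'browse.php' => 'u');

# Scan one page of results and count each proxy entry point found.
# Keeping the URLs as hash keys removes duplicates for free.
sub collect_proxies {
    my ($source, $proxy_list) = @_;
    $proxy_list ||= {};
    for my $script (sort keys %ENGINES) {
        my $param = $ENGINES{$script};
        while ($source =~ m#<h3 class="r"><a href="(https?://w{0,3}\.?[\w-]+\.[a-z]{2,4}[/\w-]*/\Q$script\E)\?\Q$param\E#ig) {
            $proxy_list->{$1}++;
        }
    }
    return $proxy_list;
}

# Made-up sample markup in the shape the regular expression expects.
my $sample = join "\n",
    '<h3 class="r"><a href="http://www.example.com/proxy/index.php?q=abc">one</a></h3>',
    '<h3 class="r"><a href="http://www.example.com/proxy/index.php?q=def">one again</a></h3>',
    '<h3 class="r"><a href="http://example.org/browse.php?u=xyz">two</a></h3>',
    '<h3 class="r"><a href="http://example.net/about.html">not a proxy</a></h3>';

my $found = collect_proxies($sample);
print "$_ ($found->{$_})\n" for sort keys %$found;
```

Since the pattern is tied to one particular results markup, it will need adjusting whenever that markup changes.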

Checking the validity of proxies

Besides building a list of web proxies, the script can check an existing list for validity: just pass the name of the file containing the proxy list through the -i option:
anocheck.pl -i proxy.txt
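
The -i option thus serves two purposes. One way a script might tell them apart is sketched below; this is an assumption about the logic, not the script's actual code, and the function name pick_mode is hypothetical.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical dispatch: decide what the value of -i means.
# Returns ('collect', $source) for google/ajax, ('check', $file) otherwise.
sub pick_mode {
    my @args = @_;
    my $input;
    GetOptionsFromArray(\@args, 'i=s' => \$input)
        or die "Usage: anocheck.pl -i <google|ajax|file>\n";
    die "Usage: anocheck.pl -i <google|ajax|file>\n" unless defined $input;

    return ('collect', $input) if $input eq 'google' || $input eq 'ajax';
    return ('check', $input);
}

my ($mode, $value) = pick_mode('-i', 'proxy.txt');
print "$mode $value\n";
```

With this scheme a list file cannot itself be named google or ajax, which is the price of reusing one option for both modes.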

The mechanism for checking the proxies found is not too complicated (I took the idea from the note mentioned in the first paragraph): a request for the Google main page is sent through each anonymizer found, and the response is parsed to see whether it contains the correct title. If the title is present, the proxy is considered working; otherwise it is moved to the list of non-working proxies:

# Assumes $ua is an LWP::UserAgent object and encode_base64 comes from MIME::Base64
foreach my $proxy_url (keys %$proxy_list) {

    # '' as the second argument keeps encode_base64 from appending a newline;
    # the ?q= parameter matches the PHProxy search above
    my $response = $ua->get($proxy_url . '?q=' . encode_base64('http://www.google.com', ''));
    # warn "Error: " . $response->status_line . "\n" unless $response->is_success;

    if ($response->decoded_content =~ m#<title>Google</title>#) {
        printf("%-45s %10s", $proxy_url, "\x1b[32m [OK] \x1b[0m\n");
    }
    else {
        printf("%-45s %10s", $proxy_url, "\x1b[31m [ERROR] \x1b[0m\n");
        # remember the failed proxy and drop it from the working list
        push(@bad_proxy, $proxy_url);
        delete($proxy_list->{$proxy_url});
    }
}
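
The sorting logic of this loop can be tried without any network access by separating it from the HTTP request. The sketch below does that; check_proxies, the $fetch hook, and the a/b/c.example hosts with their canned responses are all hypothetical and only stand in for real requests.

```perl
use strict;
use warnings;

# Split proxies into working and non-working ones.
#   $proxy_list - hash reference whose keys are proxy URLs
#   $fetch      - code reference that takes a proxy URL and returns the
#                 page body fetched through it (hypothetical hook, so the
#                 logic can be exercised without network access)
sub check_proxies {
    my ($proxy_list, $fetch) = @_;
    my (@good, @bad);
    for my $proxy_url (sort keys %$proxy_list) {
        # eval turns a request that died into undef instead of ending the run.
        my $body = eval { $fetch->($proxy_url) };
        if (defined $body && $body =~ m#<title>Google</title>#) {
            push @good, $proxy_url;
        }
        else {
            push @bad, $proxy_url;
        }
    }
    return (\@good, \@bad);
}

# Stand-in for real HTTP requests: canned responses keyed by proxy URL.
my %canned = (
    'http://a.example/index.php' => '<html><head><title>Google</title></head></html>',
    'http://b.example/index.php' => '<html><head><title>404 Not Found</title></head></html>',
);
my $fetch = sub {
    my ($url) = @_;
    die "timeout\n" unless exists $canned{$url};
    return $canned{$url};
};

my %to_check = map { $_ => 1 } keys(%canned), 'http://c.example/index.php';
my ($good, $bad) = check_proxies(\%to_check, $fetch);
print "good: @$good\n";
print "bad:  @$bad\n";
```

In a real run the hook would wrap the $ua->get(...) call and return its decoded_content. Unlike the fragment above, this version leaves the input hash untouched and returns two separate lists.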

Results of checking the proxy list

Checking the proxies produces two files (named good.txt and bad.txt by default) containing, respectively, the lists of proxies that passed and failed the check.
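
One possible way to write such a file so that repeated runs add to the list instead of overwriting it is sketched here. The function name merge_into_file is hypothetical, and the merge-and-rewrite approach is an assumption, not necessarily how the script stores its results.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Add @urls to the list stored in $path, one URL per line, keeping what is
# already there and skipping duplicates. Returns the merged, sorted list.
sub merge_into_file {
    my ($path, @urls) = @_;
    my %seen;
    if (open my $in, '<', $path) {
        while (my $line = <$in>) {
            chomp $line;
            $seen{$line} = 1 if length $line;
        }
        close $in;
    }
    $seen{$_} = 1 for @urls;

    my @merged = sort keys %seen;
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} "$_\n" for @merged;
    close $out or die "Cannot close $path: $!";
    return @merged;
}

# Demonstration in a temporary directory, so no real good.txt is touched.
my $dir = tempdir(CLEANUP => 1);
merge_into_file("$dir/good.txt", 'http://a.example/index.php');
my @all = merge_into_file("$dir/good.txt",
    'http://b.example/browse.php', 'http://a.example/index.php');
print "$_\n" for @all;
```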

Valid proxies, as mentioned above, can be plugged into a parser, and invalid ones can be rechecked from time to time (the list of valid proxies is not overwritten but added to). In general, what you do with the web proxies you find depends on your own ideas, of which I wish you plenty!

Filed under: Internet, Coding, Search Engines

Comments

8 comments to "Anonymizer as a replacement for a proxy. Checking the validity of proxies"

  1. Anton wrote:

    Excellent article, I am sure it will be helpful to many. Unfortunately, my technical level does not allow me to use it.

  2. Mad Programmer wrote:

    Thank you for mentioning my blog :) By the way, your site is interesting, so I subscribed.

  3. Den wrote:

    this is great! good article!

  4. anka wrote:

    I always use the site dostupest.ru; on others you can catch viruses
