
Parser-grabber nakolesah.ru

In my free time (away from my main job) I have been busy with a side job: I was asked to write a parser that grabs the structure of the wheel selector by car from the site nakolesah.ru (in Perl, naturally).
It is now finished (I started last Sunday) and is being tested by the customer. The nice thing is that this is the first time my hobby has earned me a little money (which I will spend on another hobby, hunting :) ).

I can't say the parser is ideal. I can't shake the feeling that everything could have been done more simply and better :)
But apart from being my first script written to order, the nakolesah.ru parser is also notable for me personally for two reasons:

  • First, it was the first time I used a multilevel data structure (until now I had never gone beyond a list nested in a hash) and, accordingly, had to deal with dereferencing references;
  • Second, it was my first encounter with aspx scripts on a server (frankly, not the most pleasant experience compared with Perl and PHP. But then, what else to expect from Microsoft?).

After parsing, the nakolesah data structure has a full seven levels of nesting, which at first was somewhat scary and confusing. However, thanks to the excellent book "Perl - exploring deeper", working it out was not very difficult.

Here is a small piece of the data structure for clarity:

 'Nissan' => {
   'Terrano' => {
     '1994' => {
       '30Di' => {
         'Wheels' => {
           '8 X 16 ET10' => {
             'Replacement' => 1
           },
           '7 X 15 ET12' => {
             'OEM' => 1
           },
           '8 X 18 ET' => {
             'Replacement' => 1
           },
           '8 X 17 ET' => {
             'Replacement' => 1
           }
         }

At the start of development I assumed the nesting would go even deeper, with an array of wheel and tire specifications added at the leaves of the tree, but that turned out to be unnecessary.
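
For illustration, here is a minimal sketch of how a seven-level hash like this can be filled and then dereferenced. It assumes the parser yields flat rows; the @rows sample and the variable names are made up for the example and are not taken from the actual script.

```perl
use strict;
use warnings;

# Hypothetical flat records, as a parser might produce them one row at a time.
my @rows = (
    [ 'Nissan', 'Terrano', '1994', '30Di', 'Wheels', '8 X 16 ET10', 'Replacement' ],
    [ 'Nissan', 'Terrano', '1994', '30Di', 'Wheels', '7 X 15 ET12', 'OEM' ],
);

my %catalog;
for my $row (@rows) {
    my ($brand, $model, $year, $modif, $type, $size, $kind) = @$row;
    # Autovivification creates every intermediate hash on first use.
    $catalog{$brand}{$model}{$year}{$modif}{$type}{$size}{$kind} = 1;
}

# Dereference one branch: $wheels is a reference to the inner "Wheels" hash.
my $wheels = $catalog{Nissan}{Terrano}{1994}{'30Di'}{Wheels};
for my $size (sort keys %$wheels) {
    my @kinds = sort keys %{ $wheels->{$size} };
    print "$size: @kinds\n";
}
```

Holding a reference to an inner hash ($wheels) keeps the loop readable and avoids repeating the whole chain of keys on every access.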

And this is what the nakolesah.ru parser produces (output to an XML file):

 <brand name="Chrysler">
	 <model name="Pacifica">
		 <year value="2005">
			 <modifi name="35i">
				 <type name="wheels">
					 <label name="8 x 17 ET38">
						 <completion>Replacement</completion>
						 <axle></axle>
					 </label>
					 <label name="7,5 x 17 ET45">
						 <completion>OEM</completion>
						 <axle></axle>
					 </label>
					 <label name="8 x 19 ET35">
						 <completion>Replacement</completion>
						 <axle></axle>
					 </label>
					 <label name="8 x 18 ET35">
						 <completion>Replacement</completion>
						 <axle></axle>
					 </label>
				 </type>
				 <type name="tires">
					 <label name="235/60 R18">
						 <completion>Replacement</completion>
						 <axle></axle>
					 </label>
					 <label name="235/65 R17">
						 <completion>OEM</completion>
						 <axle></axle>
					 </label>
					 <label name="235/55 R19">
						 <completion>Replacement</completion>
						 <axle></axle>
					 </label>
				 </type>
			 </modifi>
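
A rough sketch of how a nested hash could be turned into XML of this shape. The element and attribute names are copied from the output above; the functions to_xml and xml_escape, the @levels table, and the always-empty axle element are my assumptions for the example, not the code of the real parser.

```perl
use strict;
use warnings;

# Escape the characters that are special inside XML attribute values and text.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Element name and attribute name for each of the first six levels of the hash.
my @levels = (
    [ brand  => 'name' ], [ model => 'name' ], [ year  => 'value' ],
    [ modifi => 'name' ], [ type  => 'name' ], [ label => 'name' ],
);

# Recursively turn the nested hash into tab-indented XML text.
sub to_xml {
    my ($node, $depth) = @_;
    my $indent = "\t" x $depth;
    my $xml    = '';
    if ($depth == @levels) {
        # Innermost level: the keys are the completion kinds (OEM, Replacement).
        for my $kind (sort keys %$node) {
            $xml .= $indent . '<completion>' . xml_escape($kind) . "</completion>\n";
        }
        $xml .= $indent . "<axle></axle>\n";    # assumption: axle is left empty
        return $xml;
    }
    my ($tag, $attr) = @{ $levels[$depth] };
    for my $key (sort keys %$node) {
        $xml .= sprintf qq{%s<%s %s="%s">\n}, $indent, $tag, $attr, xml_escape($key);
        $xml .= to_xml($node->{$key}, $depth + 1);
        $xml .= $indent . "</$tag>\n";
    }
    return $xml;
}

my %catalog = (
    'Chrysler' => { 'Pacifica' => { '2005' => { '35i' => { 'wheels' => {
        '8 x 17 ET38'   => { 'Replacement' => 1 },
        '7,5 x 17 ET45' => { 'OEM'         => 1 },
    } } } } },
);

print to_xml(\%catalog, 0);
```

Because every level is handled by the same recursive function, adding one more level of nesting would only mean adding one more entry to @levels.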

In the meantime, while the customer tests the nakolesah.ru parser, I plan to gradually add the ability to resume an interrupted download, and possibly multithreading (I last used the threads library almost a year ago, so I will see what is new in it).
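
One possible way to make a run resumable, as a sketch only: keep the collected data and the list of finished pages in a file written with the core Storable module. The file name, the @pages list, and the layout of $state are hypothetical, and the actual fetching and parsing is left out.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);

my $state_file = 'nakolesah_state.sto';    # hypothetical checkpoint file name

# Load the previous state if a checkpoint exists, otherwise start fresh.
my $state = -e $state_file
    ? retrieve($state_file)
    : { done => {}, catalog => {} };

# Hypothetical work list; a real run would discover these from the site.
my @pages = ('Nissan/Terrano', 'Chrysler/Pacifica');

for my $page (@pages) {
    next if $state->{done}{$page};    # already handled in an earlier run
    # ... fetch and parse $page here, filling $state->{catalog} ...
    $state->{done}{$page} = 1;
    store($state, $state_file);       # checkpoint after every page
}
```

Writing the checkpoint after every page costs some disk I/O but means an interrupted run loses at most one page of work.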

With best wishes, dimio!

Filed under: Internet, Coding | 6 comments

Comments

6 comments to "parser-grabber nakolesah.ru"

  1. xxx writes:

    I would like to buy the parser; please contact me by e-mail.

  2. arhangel writes:

    On what terms are you selling it?

  3. pavel writes:

    I need the database. How much?

  4. zhenek writes:

    Why "naturally, in Perl"? Why not PHP?
