Reality Check - Special Search - Criteria Not in URL

Hello! Before losing myself in the documentation and trying to script something, I wanted to double-check whether it would be theoretically possible to do the following.

The legal opinions of my country’s Supreme Court can be searched and downloaded using this webpage

But once you fill in the search criteria in the left frame and click “Buscar” (search), the search terms are not incorporated in the URL. Instead, their system just returns a preview of the results you can navigate on the right frame and then download.

An example download link might give us more information on how the process works and whether or not it would be DEVONagent compatible: http://190.217.24.13:8080/WebRelatoria/FileReferenceServlet?corp=csj&ext=doc&file=698410

Here, “corp=csj” identifies the court/corporation (csj - Corte Suprema de Justicia), “ext=doc” identifies the file extension (you can also download HTML and PDF), and “file=698410” is the file number within their database.

Any ideas on whether this kind of search/crawl is doable within DEVONagent, and which tools I should investigate to try it?

Thank you!

That’s what you (don’t) see. What happens behind the scenes is a POST via XHR (XMLHttpRequest). They probably use POST rather than GET because the request can exceed 1024 characters. It looks like this:

javax.faces.partial.ajax=true&javax.faces.source=searchForm%3AsearchButton&javax.faces.partial.execute=%40all&javax.faces.partial.render=resultForm%3AjurisTable+resultForm%3ApagText2+resultForm%3AselectAllButton&searchForm%3AsearchButton=searchForm%3AsearchButton&searchForm=searchForm&searchForm%3AtemaInput=SEPARACION&searchForm%3Ascivil_focus=&searchForm%3Aslaboral_focus=&searchForm%3Aspenal_focus=&searchForm%3Asplena_focus=&searchForm%3Arelevanteselect=&searchForm%3Aoptions1=0&searchForm%3AfulltxtInput=&searchForm%3Aset-fulltxt_collapsed=true&searchForm%3AponenteInput=&searchForm%3Aset-ponente_collapsed=true&searchForm%3AfechaIniCal=&searchForm%3AfechaFinCal=&searchForm%3Aset-fecha_collapsed=true&searchForm%3AradicadoInput=&searchForm%3Aset-radicado_collapsed=true&searchForm%3AprovidenciaInput=&searchForm%3Aset-providencia_collapsed=true&searchForm%3AidInput=&searchForm%3Aset-id_collapsed=true&searchForm%3AtipoInput=&searchForm%3Aset-tipo_collapsed=true&searchForm%3AclaseInput=&searchForm%3Aset-clase_collapsed=true&searchForm%3AfuenteInput=&searchForm%3Aset-fuente_collapsed=true&searchForm%3AjurisInput=&searchForm%3Aset-juris_collapsed=true&searchForm%3AprocedenciaInput=&searchForm%3Aset-procedencia_collapsed=true&searchForm%3AdelitosInput=&searchForm%3Aset-delitos_collapsed=true&searchForm%3AsujetosInput=&searchForm%3Aset-sujetos_collapsed=true&searchForm%3AservidorInput=&searchForm%3Aset-servidor_collapsed=true&searchForm%3AcategoriaInput=&searchForm%3Aset-categoria_collapsed=true&javax.faces.ViewState=-8842560043546355949%3A4428678814168785817

In case you’re wondering, I was looking for “separacion” (all lowercase), which the JavaScript on the page transformed into “SEPARACION” (all uppercase) and then used to build this search string. This kind of thing can be simulated in JavaScript (or almost any programming language of your choice), with curl, or with something similar. But it is certainly not fun. If you’re only looking for simple things (like I did in the example above), you might get away with a pre-built request that you just augment with the search parameter.
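To illustrate the “pre-built request plus search parameter” idea, here is a minimal Python sketch using only the standard library. It takes a captured body, swaps the value of searchForm:temaInput, and re-encodes it. The CAPTURED string is a shortened stand-in for the real body above, and with_search_term is a made-up helper name. Note also that the javax.faces.ViewState value is presumably tied to a session, so a replayed body may well be rejected by the server; that part is untested.

```python
from urllib.parse import parse_qsl, urlencode

# Shortened stand-in for the captured POST body; the real one has many more fields.
CAPTURED = (
    "javax.faces.partial.ajax=true"
    "&javax.faces.source=searchForm%3AsearchButton"
    "&searchForm=searchForm"
    "&searchForm%3AtemaInput=SEPARACION"
    "&searchForm%3AfulltxtInput="
)


def with_search_term(body: str, term: str) -> str:
    """Return the form body with searchForm:temaInput set to term, upper-cased."""
    # keep_blank_values=True preserves the many empty fields of the form
    pairs = parse_qsl(body, keep_blank_values=True)
    swapped = [
        (name, term.upper() if name == "searchForm:temaInput" else value)
        for name, value in pairs
    ]
    # urlencode percent-encodes ":" as %3A again, like the original body
    return urlencode(swapped)


print(with_search_term(CAPTURED, "divorcio"))
```

The upper-casing mimics what the page’s JavaScript did to my search term; whether the server actually requires it is something I haven’t checked.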

The download link itself doesn’t tell you anything except the file number, which is probably part of the server response. The response contains this:

TUTELA</font><br><font face="verdana" size="3" color="7D3B05"><b>ID: </b></font><font face="verdana" size="3" color="000000">697074</font>

(As a side note: this is terrible, terrible web programming straight from the hell of the 80s.) So what one could try to do is:

  • send a pre-built URL to the server using
  • take the server’s answer and dissect it to get at the file #
  • build the download link for the file according to your example above.
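The last two steps could be sketched in Python roughly like this. It assumes that the number after “ID:” in the response really is the file number used in the download link (I only suspect that), and that the markup always looks like the fragment above. extract_ids and download_url are illustrative names, not anything the site or DEVONagent provides.

```python
import re
from urllib.parse import urlencode

DOWNLOAD_BASE = "http://190.217.24.13:8080/WebRelatoria/FileReferenceServlet"

# Matches the "ID:" label followed by the number in the next <font> element.
ID_PATTERN = re.compile(r"<b>ID:\s*</b></font>\s*<font[^>]*>(\d+)</font>")


def extract_ids(html: str) -> list[str]:
    """Return every number that follows an 'ID:' label in the response."""
    return ID_PATTERN.findall(html)


def download_url(file_id: str, ext: str = "doc", corp: str = "csj") -> str:
    """Build a download link in the same shape as the example link."""
    return DOWNLOAD_BASE + "?" + urlencode({"corp": corp, "ext": ext, "file": file_id})


# The response fragment quoted above, used as test input.
sample = (
    'TUTELA</font><br><font face="verdana" size="3" color="7D3B05"><b>ID: </b></font>'
    '<font face="verdana" size="3" color="000000">697074</font>'
)

for file_id in extract_ids(sample):
    print(download_url(file_id))
```

A regular expression is fragile against markup like this; it is only meant to show the idea, and a real HTML parser would be the more robust choice.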

Whether all this is possible with DEVONagent, I don’t know. Technically, it would probably be feasible in a programming language or with command-line tools like curl. However, you’d first have to analyze the POST request for different search parameters in order to be able to build the appropriate POST requests (aka “URL”) yourself. And then you’d have to parse the server’s response to get at the file #.
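As a rough idea of what “building the POST request yourself” means in a programming language, here is a small Python sketch. It only constructs the request object and does not send anything. The address http://example.invalid/search is a placeholder, not the real endpoint, and build_search_request is a hypothetical helper; the body shown is just one field of the captured request.

```python
from urllib.request import Request


def build_search_request(search_url: str, body: str) -> Request:
    """Wrap an already URL-encoded form body in a POST request object."""
    return Request(
        search_url,
        data=body.encode("ascii"),  # the encoded body contains only ASCII
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"},
        method="POST",
    )


# Placeholder endpoint and a one-field body, for illustration only.
req = build_search_request(
    "http://example.invalid/search",
    "searchForm%3AtemaInput=SEPARACION",
)
print(req.get_method(), req.full_url)
# Actually sending it would be urllib.request.urlopen(req).read(); not done here.
```

Because the search terms travel in the request body rather than in the address, this is not something you can paste into a browser’s address bar.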

Thanks for your response! Sounds more complicated than I expected.

Although not what I expected, a pre-built URL might actually prove useful. I lost you at that step though; would it be a URL I can input in the web browser or something I must do with some programming?

If for example, I wanted to go straight to the results of your “separation” search, without downloading any file, what would I have to do?

Thanks again!

Note: This search engine requires JavaScript to search and display results, so you can’t create a search plugin for this site.

If DEVONagent can be scripted, it might be possible to run JavaScript somewhere. But I don’t know the software. Even if it’s possible, it will be a pain with this website. They never got around to separating data from its presentation, so one has to wade through a swamp of HTML to find a simple number.

The command is available in DEVONagent’s dictionary but yes, it would likely be a trudge through the mud.

Got it. Thanks!