JavaScript to access APIs

I am working on some projects that access APIs. Eventually I may build a web app, but for now, for testing, I am using HTML pages in DEVONthink - aside from the ease of archiving, that seems to avoid XSS issues.

As an example, this works fine in DT3 to call the Reddit API and retrieve a few posts:

<html dir="auto">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
</head>
<body style="margin:1.5em; font-family:'Times-Roman','Times'; font-size:24px">
<script>
/* Use the Reddit API to retrieve a random Reddit post */
var xhr = new XMLHttpRequest();
xhr.open('GET', 'https://www.reddit.com/r/random.json', true);
xhr.onreadystatechange = function() {
  if (xhr.readyState == 4 && xhr.status == 200) {
    var data = JSON.parse(xhr.responseText);
    var post = data[0].data.children[0].data;
    var postDiv = document.createElement('div');
    postDiv.innerHTML = '<a href="' + post.url + '">' + post.title + '</a>';
    document.body.appendChild(postDiv);
  }
};


xhr.send();
/* Accept a subreddit name and an integer n, and retrieve the top n posts from that subreddit */
var subreddit = 'javascript';
var n = 5;
var url = 'https://www.reddit.com/r/' + subreddit + '.json?limit=' + n;
var request = new XMLHttpRequest();
request.open('GET', url, true);
request.onload = function() {
  if (request.status >= 200 && request.status < 400) {
    var data = JSON.parse(request.responseText);
    var posts = data.data.children;
    for (var i = 0; i < posts.length; i++) {
      var post = posts[i].data;
      var postDiv = document.createElement('div');
      postDiv.innerHTML = '<a href="' + post.url + '">' + post.title + '</a>';
      document.body.appendChild(postDiv);
    }
  } else {
    console.log('Request failed with status ' + request.status);
  }
};
request.send();
</script>
</body>
</html>
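
For comparison, the same top-n retrieval can be written with the newer fetch API (a sketch, untested in DT3; same endpoint and response shape assumed):

<script>
/* Fetch the top n posts from a subreddit and list them as links */
var subreddit = 'javascript';
var n = 5;
fetch('https://www.reddit.com/r/' + subreddit + '.json?limit=' + n)
  .then(function(response) {
    if (!response.ok) throw new Error('HTTP ' + response.status);
    return response.json();
  })
  .then(function(data) {
    data.data.children.forEach(function(child) {
      var post = child.data;
      var link = document.createElement('a');
      link.href = post.url;
      link.textContent = post.title;
      var postDiv = document.createElement('div');
      postDiv.appendChild(link);
      document.body.appendChild(postDiv);
    });
  })
  .catch(function(err) { console.log(err); });
</script>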

I am trying to use essentially the same XMLHttpRequest approach to retrieve PMIDs from PubMed as part of a literature search. This code fails to give any response in DT3 - yet the URL itself gives a valid response.

Any idea why the JavaScript does not work to retrieve data from the PubMed API?

https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=low+back+pain&retmax=10

<html dir="auto">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
</head>
<body style="margin:1.5em; font-family:'Times-Roman','Times'; font-size:24px">

<header>
Pubmed Test
</header>

<script>
/* Access the Pubmed Esearch utility to search for low back pain */
var xhr = new XMLHttpRequest();
xhr.open('GET', 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=low+back+pain&retmax=10',true);
xhr.onload = function() {
  if (xhr.status === 200) {
    console.log('ESearch response: ' + xhr.responseText);
  }
  else {
    alert('Request failed.  Returned status of ' + xhr.status);
  }
};
xhr.send();

</script>
</body>
</html>

What happens when you load the HTML doc in a browser, preferably Safari? Do the developer tools tell you anything?

Using Safari and looking at the console -

For the Reddit request I get an XSS error in the console, but nonetheless the page loads fine.

For the PubMed request the JavaScript does not run, and it gives an odd error about an icon. Most interesting - I can see the desired response from the API in the console(!). It just does not render in the browser.

I’d rather see the messages as text; light on dark is terrible to read.

Also, your second script doesn’t create any DOM elements, so why would the browser display anything?
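
For example, it would be enough to append the raw response to the page, along the lines of your first Reddit example (a sketch replacing your onload handler):

xhr.onload = function() {
  if (xhr.status === 200) {
    // Create a DOM element so the response actually shows up on the page
    var div = document.createElement('div');
    div.textContent = xhr.responseText;
    document.body.appendChild(div);
  } else {
    alert('Request failed. Returned status of ' + xhr.status);
  }
};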

Thank you - this version works:

<header>
Pubmed Test
</header>

<script>

/* Access the Pubmed Esearch utility to search for low back pain using usehistory=y.  Parse the result as XML and output each Id,  WebEnv, and QueryKey  using document.write */
var xmlhttp = new XMLHttpRequest();
xmlhttp.open("GET", "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=low+back+pain&usehistory=y", false);
xmlhttp.send();
var xmlDoc = xmlhttp.responseXML;
var webenv = xmlDoc.getElementsByTagName("WebEnv")[0].childNodes[0].nodeValue;
var querykey = xmlDoc.getElementsByTagName("QueryKey")[0].childNodes[0].nodeValue;
var idlist = xmlDoc.getElementsByTagName("IdList")[0].childNodes[0].nodeValue;
document.write("<p>WebEnv: " + webenv + "</p>");
document.write("<p>IdList: " + idlist + "</p>");

</script>

Output: [screenshot]

Very interestingly - with just a bit of tweaking of my prompt (see the comments), the OpenAI JavaScript Sandbox was able to generate this script automatically. Quite cool. It takes some practice to learn what prompt to give - which itself is helpful for thinking through how the language works. But overall it is a huge timesaver and a great educational tool.

Yes, the AI is able to churn out code that humans have been writing for at least 15 years and with which the net is overflowing. After we’ve learned to prod it in the right direction.
Which is the irritating part: if I already have to have an idea of the right solution, then what is the AI actually doing, apart from running a net search?

If at all possible, I’d try to keep away from XML and use JSON, btw. A lot easier to work with.

Agreed - but the PubMed E-Utilities (an “API” which predates the modern web) offer “JSON” output that actually is ASN.1 - at first glance it seems like JSON, but it really is not and cannot be parsed with the usual JavaScript functions. ASN.1 long predates JSON.
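
To be fair, ESearch itself can return genuine JSON via retmode=json; it is EFetch for PubMed that lacks it. A quick sketch (the esearchresult.idlist field name is from memory of NCBI’s JSON format, so treat it as an assumption to verify):

<script>
/* ESearch with retmode=json returns real JSON that response.json() can handle */
var url = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi'
  + '?db=pubmed&term=low+back+pain&retmax=10&retmode=json';
fetch(url)
  .then(function(response) { return response.json(); })
  .then(function(data) {
    // The PMIDs are nested under esearchresult.idlist (assumed field names)
    console.log(data.esearchresult.idlist);
  });
</script>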


So do you think users of GitHub Copilot are mostly professional programmers who find AI code more efficient than searching the Internet, or are they more “amateur” programmers like me, who work full-time in some other field and find it helpful to have AI software that fills in a lot of the minutiae/syntax of code and thus makes things a lot easier when you do not code daily?

First: this code does not produce the same output as in your screenshot here. In fact, the IdList is empty. Which is quite understandable, given the code.

I’ll try to dissect it:
xmlDoc.getElementsByTagName("IdList")[0] returns the first HTMLElement having the tag name ‘IdList’. That is, as one can see in the debug view of the browser’s developer tools, an element with 41 childNodes. The code doesn’t care about that, it takes the first of these childNodes with childNodes[0] and then gets at its value with nodeValue. This first child is a Text node with the content “\n”. It’s the linefeed between the closing “>” of “IdList” and the opening “<” of “Id”. And that’s exactly what IdList gets set to here – a linefeed. I don’t know how you got the output in your screenshot from that code.

If I use var idlist = xmlDoc.getElementsByTagName("IdList")[0].innerHTML;, at least I get the ID values, though not one on each line (one would have to do some more formatting for that). And that is probably not valid HTML either, because id is not an HTML element.
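
Or one could iterate over the Id elements directly and get one PMID per line without generating invalid HTML (a sketch, using the xmlDoc from your code):

var ids = xmlDoc.getElementsByTagName("Id");
for (var i = 0; i < ids.length; i++) {
  // textContent yields the PMID inside each Id element
  document.body.appendChild(document.createTextNode(ids[i].textContent));
  document.body.appendChild(document.createElement("br"));
}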

I wouldn’t know - I’m not using it myself. I was only talking about the code you posted here, which is from XMLHTTPRequest101. It works, kind of. All I’m saying is that instead of having an AI search the net for you, you could’ve searched it yourself :wink:

What I would find intriguing: if you throw your initial posting at the AI (i.e. the one with “this works, but this doesn’t”), does the AI tell you immediately “well, if you do not set DOM elements, you won’t see anything. Let’s do it this way…”?

Aside: document.write is really not what one should do in 2023, given that we’ve had a nice DOM interface for quite some time now (you used it yourself in the other sample). And I’d expect an AI to tell you that, too :wink: BTW, the AI does not try to output the QueryKey, although you told it to do so? And is the output even valid HTML? I’m not so sure, given that IdList contains Id elements, which HTML does not know about.

Besides the code not working here, it is (in my opinion) a bad example of how one would do it (it’s even lacking the most basic error handling that your original version provided). Rather, it is what someone with no knowledge of programming would cobble together after some days with their search engine.

Oh, did I mention that every single object is declared as a global var without any need for that? Which really points to old and bad sources for the code.

I’d love to see ChatGPT’s answer to the questions “Why might some people consider this code to be bad? How could it be improved?”
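
To sketch what I’d expect such an improved version to look like - fetch with error handling, block-scoped variables, and DOM insertion instead of document.write (untested in DT3):

<script>
const url = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi'
  + '?db=pubmed&term=low+back+pain&usehistory=y';

fetch(url)
  .then((response) => {
    // Basic error handling, which the document.write version lacks
    if (!response.ok) throw new Error('ESearch failed: HTTP ' + response.status);
    return response.text();
  })
  .then((text) => {
    const xmlDoc = new DOMParser().parseFromString(text, 'text/xml');
    const webenv = xmlDoc.getElementsByTagName('WebEnv')[0].textContent;
    const querykey = xmlDoc.getElementsByTagName('QueryKey')[0].textContent;
    const out = document.createElement('div');
    // One PMID per line
    for (const id of xmlDoc.getElementsByTagName('Id')) {
      out.appendChild(document.createTextNode(id.textContent));
      out.appendChild(document.createElement('br'));
    }
    out.appendChild(document.createTextNode('WebEnv: ' + webenv));
    out.appendChild(document.createElement('br'));
    out.appendChild(document.createTextNode('QueryKey: ' + querykey));
    document.body.appendChild(out);
  })
  .catch((err) => console.error(err));
</script>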

I might have posted the wrong version of the code, as I was working on a number of variants of it. The code below definitely does “work”, though I agree it is not pretty :slight_smile:

The points you make about “bad code” are definitely interesting and I think worth discussion.

First of all - PubMed is somewhat of an oddball case. On the one hand, it is a really important information source for which there is no real alternative, given my primary goal of accessing the medical literature. But the National Library of Medicine in the USA invented this “API” before we even had a public Internet as we know it today. So it does not work in ways that are “standard” and accepted for virtually all websites today. That means it is a particularly difficult site to access programmatically via AI, because AI generally assumes a modern API architecture, which does not exist in this situation.

That said - I really do like the OpenAI Codex JavaScript Sandbox and I am very likely to start using GitHub Copilot regularly. I completely agree that the resulting code may be outdated or inefficient. But it works! And I can create it, edit it, and tweak it myself in a lot less time than it would take me to write it from scratch.

There is no doubt that if I were to apply for a job as a developer and showed a portfolio with code like this, I would be laughed out the door. But what if I am not a “professional” developer but instead a “citizen developer”, i.e. a physician or scientist or manager or attorney or someone else who is the end user of software? I think the ability to quickly whip up “bad code” that “works” using AI is a huge benefit; I would think very highly of a colleague who does that. Sometimes that’s the easiest/quickest way to accomplish a one-off task or a proof of concept. If it’s code that turns out to be used regularly, then I might well hire a “real” programmer to refactor it; but even then, it’s a whole lot easier for me to approach a programmer and ask them to bring my spaghetti code up to modern standards than to try to define the task from square one; so it’s a win-win overall.

It is also important to note that the AI improves as my ability to give it instructions improves. I can specify things like “Do not use document.write” or “Use fetch” or “Use innerHTML” etc. So, for example, I will accept your constructive feedback (thank you) and will use those concepts when I create AI code in the future. It’s a whole lot easier for me to instruct the AI with something like “Use innerHTML to output all of the metadata” than to write out all of the code myself and be sure not to miss a punctuation mark or parameter somewhere.

That said - I don’t know this for sure, but I suspect the typical GitHub Copilot user is neither an “amateur citizen developer” like me nor a very experienced developer such as yourself. My guess is that its biggest target user is an early-career developer or student developer working towards a skill set such as yours. Surely some experienced developers will frown on that and say they should look it up manually. Is that good advice?

Maybe a good analogy is when I see my daughter in medical school using 3D anatomy software to study concepts that I could only learn 30 years ago on a cadaver in the gross anatomy lab; should I be old-school and insist on the merits of the classic way, or should I instead be envious that I did not have such study tools available when I was in medical school? The same is true when I see young student pilots planning their routes entirely on the computer, who never had to plot them on paper as I did; should everyone learn the older techniques, or do the newer techniques add more value? I think there are good arguments both ways in each of these fields - computing, medicine, aviation - and the underlying issue of technological change permeates them all.

<header>
Pubmed Test
</header>

<script>

/* Access the Pubmed Esearch utility to search for low back pain using usehistory=y.  Parse the result as XML and output each Id,  WebEnv, and QueryKey  using document.write.  Save the Webenv and QueryKey.  Then perform an eFetch using the WebEnv and QueryKey from the prior steps and the parameters &retmax=50&retmode=text&rettype=abstract. Output the result.  Use fetch instead of xmlhttprequest.  Keep carriage returns in text from fetch. */

var url = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=low+back+pain&usehistory=y';
fetch(url)
  .then(function(response) {
    return response.text();
  })
  .then(function(text) {
    var parser = new DOMParser();
    var xmlDoc = parser.parseFromString(text, "text/xml");
    var webenv = xmlDoc.getElementsByTagName("WebEnv")[0].childNodes[0].nodeValue;
    var querykey = xmlDoc.getElementsByTagName("QueryKey")[0].childNodes[0].nodeValue;
    var ids = xmlDoc.getElementsByTagName("Id");
    for (var i = 0; i < ids.length; i++) {
      document.write(ids[i].childNodes[0].nodeValue + '<br>');
    }
    document.write(webenv + '<br>');
    document.write(querykey + '<br>');
    var url2 = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&query_key=' + querykey + '&WebEnv=' + webenv + '&retmax=50&retmode=text&rettype=abstract';
    fetch(url2)
      .then(function(response) {
        return response.text();
      })
      .then(function(text) {
        document.write(text);
      });
  });
</script>
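
One caveat: despite the “Keep carriage returns in text from fetch” instruction in the prompt, document.write(text) renders the abstracts as HTML, which collapses the line breaks. Putting the text into a <pre> element would preserve them - a sketch for the final .then:

.then(function(text) {
  // textContent inside a <pre> keeps the line breaks that
  // document.write(text) would collapse as HTML whitespace
  var pre = document.createElement('pre');
  pre.textContent = text;
  document.body.appendChild(pre);
});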

I’ll reply via PM, since that discussion is off-topic here.