I'm trying to extract Hebrew text from this web page:
https://www.sefaria.org/Berakhot.2a.2?lang=he&with=all&lang2=he
But the result is only the first p element, so I also tried this selector:
Elements elements = document.select(".segmentNumber sans .content-section p");
But nothing happens.
Can you tell me what's wrong with the code, and how I can get all the p elements from the web page?
Thanks.
This is the code:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import java.io.IOException;

public class Scraping {
    public static void main(String[] args) throws IOException {
        try {
            Document document = Jsoup.connect("https://www.sefaria.org/Berakhot.2a?lang=he").get();
            System.out.println(document.text());
            System.out.println("Selecting HTML tag name having specified class name");
            Elements elements = document.select("p.he");
            if (elements.size() > 0)
                System.out.println(elements.get(0));
        } catch (IOException ioe) {
            System.out.println("Unable to connect to the URL");
        }
    }
}
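Judging only from the code above, elements.get(0) prints just the first match, which would explain seeing a single p. A minimal sketch of looping over every match instead is shown here. It parses a small inline HTML string so that it runs without a network connection; that markup, the class name AllParagraphs, and the helper paragraphTexts are made up for illustration and are not taken from the real page. If the page fills in its text with JavaScript after loading, the HTML that jsoup downloads may still lack those paragraphs, and this sketch does not address that.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

class AllParagraphs {
    // Hypothetical helper: collects the text of every <p class="he"> in the given HTML.
    static List<String> paragraphTexts(String html) {
        Document document = Jsoup.parse(html);
        Elements elements = document.select("p.he");
        List<String> texts = new ArrayList<>();
        // Elements is a list of Element, so iterate over all matches
        // instead of reading only elements.get(0).
        for (Element element : elements) {
            texts.add(element.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        // Made-up markup for illustration, not copied from the real page.
        String html = "<div><p class=\"he\">first</p>"
                + "<p class=\"he\">second</p><p>other</p></div>";
        for (String text : paragraphTexts(html)) {
            System.out.println(text);
        }
    }
}
```

On the other selector: in ".segmentNumber sans .content-section p" the bare word sans is read as a tag name, so the selector looks for a sans element and matches nothing. If sans is meant as a second class on the same element, it would be written ".segmentNumber.sans".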