html – Extract site content with Java

Good evening. I'm trying to use Apache Tika with Java to extract the content of a website for academic purposes. The problem is that I have not yet figured out how to extract only specific HTML tags and their content. Is there a tool or technique for scraping specific tags with Tika?
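
To make the goal concrete, here is a minimal sketch of the kind of tag-specific extraction I have in mind. Tika parsers report content as XHTML SAX events to an `org.xml.sax.ContentHandler` passed to `Parser.parse(InputStream, ContentHandler, Metadata, ParseContext)`, so a custom handler seems like one possible route. The class `TagTextCollector` and its `extract` method are hypothetical names of my own, not part of Tika. To keep the sketch runnable without the Tika dependency, the demo feeds a small well-formed XHTML string through the JDK's SAX parser; I am assuming the same handler could be handed to a Tika parser instead, and that Tika's HTML parser may normalize or drop some elements depending on how it is configured.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: collects the text inside every element with a given tag name. */
public class TagTextCollector extends DefaultHandler {
    private final String tag;
    private final List<String> results = new ArrayList<>();
    private final StringBuilder buffer = new StringBuilder();
    private int depth = 0; // greater than 0 while inside a matching element

    public TagTextCollector(String tag) {
        this.tag = tag;
    }

    public List<String> getResults() {
        return results;
    }

    // Use the local name when the parser is namespace-aware, else fall back to the qualified name.
    private static String nameOf(String localName, String qName) {
        return (localName == null || localName.isEmpty()) ? qName : localName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (tag.equalsIgnoreCase(nameOf(localName, qName))) {
            if (depth == 0) {
                buffer.setLength(0); // start a fresh capture for an outermost match
            }
            depth++;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (depth > 0) {
            buffer.append(ch, start, length); // keep text only while inside the wanted tag
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (depth > 0 && tag.equalsIgnoreCase(nameOf(localName, qName))) {
            depth--;
            if (depth == 0) {
                results.add(buffer.toString().trim());
            }
        }
    }

    /** Demo entry point: parses well-formed XHTML with the JDK SAX parser (assumption: no Tika here). */
    public static List<String> extract(String xhtml, String tag) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        TagTextCollector handler = new TagTextCollector(tag);
        factory.newSAXParser().parse(new InputSource(new StringReader(xhtml)), handler);
        return handler.getResults();
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body><h1>Title</h1><p>Intro text</p><h1>Second</h1></body></html>";
        System.out.println(extract(page, "h1"));
    }
}
```

With Tika on the classpath, the idea would be to pass a `TagTextCollector` as the `ContentHandler` argument of the parser's `parse` call and read `getResults()` afterwards. I have not verified how this behaves on real-world, malformed HTML.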