public class HTMLExtractor extends Object
| Modifier and Type | Method and Description |
|---|---|
| `static boolean` | `contains(String url, String markup, String selector, String value)`<br>Checks whether any of the elements from the `markup` identified by the `selector` contains the text from `value`. |
| `static boolean` | `exists(String url, String markup, String selector)`<br>Checks whether the `selector` identifies an element from the `markup`. |
| `static boolean` | `hasAttribute(String url, String markup, String selector, String attributeName)`<br>Checks whether any of the elements matched by the `selector` has the attribute `attributeName`. |
| `static boolean` | `hasAttributeValue(String url, String markup, String selector, String attributeName, String attributeValue)`<br>Checks whether any of the elements matched by the `selector` has the attribute `attributeName` with value `attributeValue`. |
| `static boolean` | `hasChildren(String url, String markup, String selector, int howMany)`<br>Checks whether the first element matched by the `selector` has children and whether their number equals `howMany`. |
| `static boolean` | `hasClosingTag(String url, String markup, String selector)` |
| `static String` | `innerHTML(String url, String markup, String selector)`<br>Retrieves the content of the elements identified by the `selector` in the given `markup`, without their own markup tags. |
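
The method details describe the `url` argument as a cache key only: it lets the class avoid re-parsing markup it has already seen for the same resource. The following is a minimal sketch of that idea, not the actual `HTMLExtractor` implementation. `UrlKeyedCacheSketch`, `ParsedMarkup`, `parsedFor`, `PARSE_COUNT`, and the example URL are all hypothetical names, and the parse step is a stand-in that simply wraps the string.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these names come from HTMLExtractor itself.
final class UrlKeyedCacheSketch {

    // Stand-in for a parsed document; a real implementation would hold a DOM tree.
    static final class ParsedMarkup {
        final String source;

        ParsedMarkup(String source) {
            this.source = source;
        }
    }

    // Parsed documents keyed by the url that identifies the markup.
    private static final Map<String, ParsedMarkup> CACHE = new ConcurrentHashMap<>();

    // Counts how often the (assumed expensive) parse step actually runs.
    static final AtomicInteger PARSE_COUNT = new AtomicInteger();

    // Returns the parsed form of the markup, parsing only on the first call for a given url.
    static ParsedMarkup parsedFor(String url, String markup) {
        return CACHE.computeIfAbsent(url, key -> {
            PARSE_COUNT.incrementAndGet();
            return new ParsedMarkup(markup);
        });
    }

    public static void main(String[] args) {
        String url = "/content/page.html"; // hypothetical resource URL
        String markup = "<div id='a'><p>hello</p></div>";

        ParsedMarkup first = parsedFor(url, markup);
        ParsedMarkup second = parsedFor(url, markup);

        System.out.println(first == second);   // prints "true": the second call reused the cached object
        System.out.println(PARSE_COUNT.get()); // prints "1": the parse step ran once
    }
}
```

In this sketch the url alone is the key, so passing different markup under a url that is already cached returns the earlier parse. Whether the real class behaves the same way is not stated in this page.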

`public static String innerHTML(String url, String markup, String selector)`

Retrieves the content of the elements identified by the `selector` in the given `markup`, without their own markup tags. The `url` is used only for caching purposes, to avoid parsing the markup returned for the same resource multiple times.

Parameters:
- `url` - the url that identifies the markup
- `markup` - the markup
- `selector` - the selector used for retrieval

`public static boolean contains(String url, String markup, String selector, String value)`

Checks whether any of the elements from the `markup` identified by the `selector` contains the text from `value`. The `url` is used only for caching purposes, to avoid parsing the markup returned for the same resource multiple times.

Parameters:
- `url` - the url that identifies the markup
- `markup` - the markup
- `selector` - the selector used for retrieval
- `value` - the text that should exist in the markup

Returns: `true` if the `value` was found in the markup, `false` otherwise

`public static boolean exists(String url, String markup, String selector)`

Checks whether the `selector` identifies an element from the `markup`. The `url` is used only for caching purposes, to avoid parsing the markup returned for the same resource multiple times.

Parameters:
- `url` - the url that identifies the markup
- `markup` - the markup
- `selector` - the selector used for retrieval

Returns: `true` if the element identified by the `selector` exists, `false` otherwise

`public static boolean hasAttribute(String url, String markup, String selector, String attributeName)`

Checks whether any of the elements matched by the `selector` has the attribute `attributeName`. The `url` is used only for caching purposes, to avoid parsing the markup returned for the same resource multiple times.

Parameters:
- `url` - the url that identifies the markup
- `markup` - the markup
- `selector` - the selector used for retrieval
- `attributeName` - the attribute's name

Returns: `true` if the attribute was found, `false` otherwise

`public static boolean hasAttributeValue(String url, String markup, String selector, String attributeName, String attributeValue)`

Checks whether any of the elements matched by the `selector` has the attribute `attributeName` with value `attributeValue`. The `url` is used only for caching purposes, to avoid parsing the markup returned for the same resource multiple times.

Parameters:
- `url` - the url that identifies the markup
- `markup` - the markup
- `selector` - the selector used for retrieval
- `attributeName` - the attribute's name
- `attributeValue` - the attribute's value

Returns: `true` if the attribute was found and has the specified value, `false` otherwise

`public static boolean hasChildren(String url, String markup, String selector, int howMany)`

Checks whether the first element matched by the `selector` has children and whether their number equals `howMany`.

Parameters:
- `url` - the url that identifies the markup
- `markup` - the markup
- `selector` - the selector used for retrieval
- `howMany` - the number of expected children

Returns: `true` if the number of children is equal to `howMany`, `false` otherwise

Copyright © 2020. All rights reserved.