Finding text in web page
Péter Bencze edited this page Jun 19, 2019 · 1 revision
TextFinder is a helper class which can be used to find text in web elements.
A text matching pattern must be specified. Optionally, one or more locating mechanisms can also be provided. They are used to locate the web elements on the page whose content is searched for matching text. If none is specified, the By.tagName("body") locator is used by default.
Use:
- findAllInResponse: to find all text that matches the pattern
- findFirstInResponse: to find the first text that matches the pattern
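import java.util.regex.Pattern;
// Serritor imports (Crawler, CrawlerConfiguration, ResponseSuccessEvent, TextFinder) omitted

public class MyCrawler extends Crawler {

    private final TextFinder textFinder;

    public MyCrawler(final CrawlerConfiguration config) {
        super(config);

        // A helper class that is intended to make it easier to find text on web pages.
        // No locator is given, so the content of the body element is searched.
        textFinder = new TextFinder(Pattern.compile("text pattern", Pattern.MULTILINE));
    }

    @Override
    protected void onResponseSuccess(final ResponseSuccessEvent event) {
        // Runs only when the pattern matches somewhere in the searched content
        textFinder.findFirstInResponse(event.getCompleteCrawlResponse())
                .ifPresent(matchResult -> {
                    // Do something with the matched text...
                });

        // ...
    }
}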
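The difference between the two methods listed above can be illustrated without a crawler, using only java.util.regex. The following is a minimal sketch, not TextFinder's actual implementation: the class TextMatchSketch and its methods findAll and findFirst are hypothetical names, and the return types (a list of MatchResult and an Optional<MatchResult>) are an assumption based on the ifPresent(matchResult -> ...) call in the example above. It requires Java 9 or later because of Matcher.results().

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical stand-in that mimics "find all" vs. "find first" on a plain string
public class TextMatchSketch {

    // Collects every non-overlapping match of the pattern, in order of appearance
    static List<MatchResult> findAll(final Pattern pattern, final String text) {
        return pattern.matcher(text).results().collect(Collectors.toList());
    }

    // Returns only the first match, or an empty Optional if nothing matches
    static Optional<MatchResult> findFirst(final Pattern pattern, final String text) {
        final Matcher matcher = pattern.matcher(text);
        return matcher.find() ? Optional.of(matcher.toMatchResult()) : Optional.empty();
    }

    public static void main(final String[] args) {
        final Pattern digits = Pattern.compile("\\d+");
        final String text = "a1 b22 c333";

        // Every match is visited
        findAll(digits, text).forEach(match -> System.out.println("all: " + match.group()));

        // Only the first match is visited, and only if there is one
        findFirst(digits, text).ifPresent(match -> System.out.println("first: " + match.group()));
    }
}
```

In a crawler, the string would be replaced by the text content of the located web elements; the matching semantics stay the same.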