Extracting specific <a href> URLs out of the document

I think this should be elementary, but I still can't get my head around it. Let's say there is a fair number of HTML documents and I need to extract every image URL from them.
The rest of the content changes, but the base of the URL is always the same, for example: `http://images.examplesite.com/images/`.
So I want to extract every string that contains that part. The problem is that they're always mixed in with `<a href=''>` or `<img src=''>` tags, so how could I strip those away? `preg_match`, probably?
possible duplicate of [PHP Xpath: get all href values that contain needle](http://stackoverflow.com/questions/2392393/php-xpath-get-all-href-values-that-contain-needle) – Gordon
You can also use DOM, as shown in [Preg_Match All A href](http://stackoverflow.com/questions/1519696/preg-match-all-a-href/1519791#1519791). Just change the XPath to the one given in the linked duplicate. – Gordon
I'll give it a try :) – Seerumi
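
For anyone landing here, the DOM approach suggested in the comments might look roughly like the sketch below. It is not taken from either linked answer: the sample markup, the file names in it, and the helper name `extractUrls()` are all made up for illustration, and it assumes PHP 7 or later with the DOM extension enabled. The idea is to let the parser deal with the tags and use XPath's `contains()` to keep only the `href` and `src` values that include the base URL from the question.

```php
<?php
// Hypothetical sample input; real documents would be read from files instead.
$html = <<<HTML
<html><body>
<a href="http://images.examplesite.com/images/photo1.jpg">Photo 1</a>
<img src="http://images.examplesite.com/images/photo2.png" alt="">
<a href="http://www.examplesite.com/about.html">About</a>
<img src="/local/logo.gif" alt="">
</body></html>
HTML;

/**
 * Return every <a href> / <img src> value that contains $base.
 * extractUrls() is an illustrative helper, not a built-in PHP function.
 * $base must not contain a double quote, since it is placed inside an
 * XPath string literal.
 */
function extractUrls(string $html, string $base): array
{
    $dom = new DOMDocument();

    // Real-world HTML is often malformed; keep parser warnings out of the output.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // Select the attribute nodes themselves, so no tag stripping is needed.
    $query = sprintf(
        '//a/@href[contains(., "%1$s")] | //img/@src[contains(., "%1$s")]',
        $base
    );

    $urls = [];
    foreach ($xpath->query($query) as $attr) {
        $urls[] = $attr->nodeValue;
    }

    // Drop duplicates and reindex from 0.
    return array_values(array_unique($urls));
}

$urls = extractUrls($html, 'http://images.examplesite.com/images/');
print_r($urls);
```

Compared with `preg_match`, this does not care whether the attributes use single or double quotes or in which order they appear, which is usually where regex-based extraction from HTML starts to break.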