Guava Resources.readLines() for Zip/Gzip files

I found Resources.readLines() and Files.readLines(), which help simplify my code.
The problem is that I often read gzip-compressed text files, or text files inside zip archives, from URLs (HTTP and FTP).
Is there a way to use Guava's methods to read from these URLs as well? Or is that only possible with Java's GZIPInputStream/ZipInputStream?

2015-08-14 user1775213

If you are on Java 8, you can use `BufferedReader#lines()`.

Ping! I added a `ByteSource` for Zip to my answer.
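The Java 8 suggestion in the comment above can be sketched with the JDK alone. This is a minimal sketch, not anyone's posted solution: the class name `GzipLines` and the helpers `readGzippedLines` and `gzip` are hypothetical names chosen for illustration, and the data is gzipped in memory so the example needs no network access.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipLines {
    // Hypothetical helper: decompresses the stream and collects its lines.
    // The raw stream is declared as a resource so it is closed even if the
    // GZIP header turns out to be invalid.
    static List<String> readGzippedLines(InputStream gzipped, Charset charset) throws IOException {
        try (InputStream raw = gzipped;
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(new GZIPInputStream(raw), charset))) {
            return reader.lines().collect(Collectors.toList());
        }
    }

    // Hypothetical helper for the demo: gzips a string in memory.
    static byte[] gzip(String text, Charset charset) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(charset));
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = gzip("first\nsecond\n", StandardCharsets.UTF_8);
        List<String> lines = readGzippedLines(new ByteArrayInputStream(data), StandardCharsets.UTF_8);
        lines.forEach(System.out::println);
    }
}
```

For a remote file, the stream returned by `new URL(urlStr).openStream()` could be passed in place of the in-memory stream.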

You can create your own ByteSources:

For GZip:

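import com.google.common.io.ByteSource;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

public class GzippedByteSource extends ByteSource {
    private final ByteSource source;
    public GzippedByteSource(ByteSource gzippedSource) { source = gzippedSource; }
    // Wrap the underlying stream so callers read decompressed bytes.
    @Override public InputStream openStream() throws IOException {
        return new GZIPInputStream(source.openStream());
    }
}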

Then use it:

Charset charset = ... ; 
new GzippedByteSource(Resources.asByteSource(url)).toCharSource(charset).readLines();

Here is an implementation for Zip. It assumes that you read only a single entry.

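import com.google.common.io.ByteSource;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Declared static because it is written as a nested class; drop 'static' for a top-level class.
public static class ZipEntryByteSource extends ByteSource {
    private final ByteSource source;
    private final String entryName;

    public ZipEntryByteSource(ByteSource zipSource, String entryName) {
        this.source = zipSource;
        this.entryName = entryName;
    }

    @Override public InputStream openStream() throws IOException {
        final ZipInputStream in = new ZipInputStream(source.openStream());
        while (true) {
            final ZipEntry entry = in.getNextEntry();
            if (entry == null) {
                // Reached the end of the archive without finding the entry.
                in.close();
                throw new IOException("No entry named " + entryName);
            } else if (entry.getName().equals(this.entryName)) {
                // Expose only this entry; closing it also closes the zip stream.
                return new InputStream() {
                    @Override
                    public int read() throws IOException {
                        return in.read();
                    }

                    // Delegate bulk reads too, so reading is not done byte by byte.
                    @Override
                    public int read(byte[] b, int off, int len) throws IOException {
                        return in.read(b, off, len);
                    }

                    @Override
                    public void close() throws IOException {
                        in.closeEntry();
                        in.close();
                    }
                };
            } else {
                in.closeEntry();
            }
        }
    }
}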

And you can use it like this:

Charset charset = ... ; 
String entryName = ... ; // Name of the entry inside the zip file. 
new ZipEntryByteSource(Resources.asByteSource(url), entryName).toCharSource(charset).readLines();

2015-08-14 16:35:06

`GzipInputStream` should be `GZIPInputStream` – nezda

As Olivier Grégoire said, you can create the necessary ByteSources for whatever compression scheme you need in order to use Guava's readLines function.

For zip archives, though, while this is possible, I don't think it is worth it. It is easier to write your own readLines method that iterates over the zip entries and reads the lines of each entry itself. Here is a class that demonstrates how to read and print the lines from a URL pointing to a zip archive:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class ReadLinesOfZippedUrl {
    public static List<String> readLines(String urlStr, Charset charset) {
        List<String> retVal = new LinkedList<>();
        try (ZipInputStream zipInputStream = new ZipInputStream(new URL(urlStr).openStream())) {
            for (ZipEntry zipEntry = zipInputStream.getNextEntry(); zipEntry != null; zipEntry = zipInputStream.getNextEntry()) {
                // don't close this reader or you'll close the underlying zip stream
                BufferedReader reader = new BufferedReader(new InputStreamReader(zipInputStream, charset));
                retVal.addAll(reader.lines().collect(Collectors.toList())); // slurp all the lines from one entry
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return retVal;
    }

    public static void main(String[] args) {
        String urlStr = "http://central.maven.org/maven2/com/google/guava/guava/18.0/guava-18.0-sources.jar";
        Charset charset = StandardCharsets.UTF_8;
        List<String> lines = readLines(urlStr, charset);
        lines.forEach(System.out::println);
    }
}
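The class above merges the lines of every entry into one list. If it matters which entry a line came from, a variation along the following lines may help. It is only a sketch using the JDK: the names `ZipLinesByEntry`, `readLinesByEntry` and `zip` are hypothetical, and the archive is built in memory so that the example runs without a network connection.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ZipLinesByEntry {
    // Hypothetical helper: maps each file entry name to the lines it contains,
    // in the order the entries appear in the archive.
    static Map<String, List<String>> readLinesByEntry(InputStream zipped, Charset charset) throws IOException {
        Map<String, List<String>> result = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(zipped)) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                if (entry.isDirectory()) {
                    continue; // directories carry no text
                }
                // Not closed on purpose: closing the reader would close the zip stream.
                BufferedReader reader = new BufferedReader(new InputStreamReader(zip, charset));
                result.put(entry.getName(), reader.lines().collect(Collectors.toList()));
            }
        }
        return result;
    }

    // Hypothetical helper for the demo: builds a zip archive in memory.
    static byte[] zip(Map<String, String> files, Charset charset) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, String> file : files.entrySet()) {
                out.putNextEntry(new ZipEntry(file.getKey()));
                out.write(file.getValue().getBytes(charset));
                out.closeEntry();
            }
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("a.txt", "one\ntwo\n");
        files.put("b.txt", "three");
        byte[] archive = zip(files, StandardCharsets.UTF_8);
        readLinesByEntry(new ByteArrayInputStream(archive), StandardCharsets.UTF_8)
                .forEach((name, lines) -> System.out.println(name + ": " + lines));
    }
}
```

A fresh reader per entry works here because ZipInputStream reports end of stream at the end of each entry, so a reader cannot buffer past the entry it was created for.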

2015-08-14 17:30:34 heenenee
