Reading a complex table with pandas (task-spooler)
I have the following table, which is the output of task-spooler.
It is easy for a human to parse, but I am having trouble reading it into a pandas DataFrame.
Any ideas?
ID State Output E-Level Times(r/u/s) Command [run=1/2]
6 running /tmp/ts-out.FzVneG [l1]python infloop.py
0 finished /tmp/ts-out.ixWHm2 0 0.00/0.00/0.00 bash -c echo 1
1 finished /tmp/ts-out.ZzwS11 0 0.00/0.00/0.00 bash -c echo 1
2 finished /tmp/ts-out.GJlyge 2 0.00/0.00/0.00 bash -c
4 finished /tmp/ts-out.lIVMYH 2 0.00/0.00/0.00 bash -c -h
5 finished /tmp/ts-out.8EKHy1 -1 141.23/0.00/0.00 python infloop.py
3 finished /tmp/ts-out.lBr4Wy -1 2545.36/0.00/0.02 bash -c python infloop.py
7 finished /tmp/ts-out.kxCczi 2 0.01/0.00/0.00 bash -c
8 finished /tmp/ts-out.3VkfNh 0 0.00/0.00/0.00 echo
9 finished /tmp/ts-out.8ewxzl 0 0.01/0.00/0.00 echo
10 finished /tmp/ts-out.ahSLaY 0 0.00/0.00/0.00 bash -c echo $GPUID
11 finished /a/home/cc/cs/yuvval/tmp/ts-out.3dpaBO 0 0.00/0.00/0.00 bash -c ls
12 finished /tmp/ts-out.ADWkve 0 0.00/0.00/0.00 bash -c ls
13 finished /a/home/cc/cs/yuvval/tmp/ts-out.xm0jtn -1 130.67/0.00/0.02 bash -c python infloop.py
14 finished /tmp/ts-out.HxBqkm 0 0.00/0.00/0.00 bash -c echo 11
15 finished /tmp/ts-out.ERNuaE 0 0.00/0.00/0.00 bash -c echo
16 finished /tmp/ts-out.9j6hkS 0 0.00/0.00/0.00 bash -c echo $GPUID
17 finished /tmp/ts-out.Y5QDNa 0 0.00/0.00/0.00 bash -c echo $GPUID
18 finished /tmp/ts-out.EIHhoX -1 0.00/0.00/0.00 %s
19 finished /tmp/ts-out.LLw2Wl -1 0.00/0.00/0.00
20 finished /tmp/ts-out.deWAJR -1 0.01/0.00/0.00 echo $GPUID
21 finished /tmp/ts-out.AdZFIf -1 0.00/0.00/0.00 echo 12
22 finished /tmp/ts-out.NBOCVv 0 0.00/0.00/0.00 echo 12
23 finished /tmp/ts-out.5WpfPu 0 0.00/0.00/0.00 echo
24 finished /tmp/ts-out.1lw4bS -1 0.00/0.00/0.00 echo
25 finished /tmp/ts-out.7MNGLQ 0 0.00/0.00/0.00 bash -c echo $GPUID
26 finished /tmp/ts-out.8FZ3on 0 0.00/0.00/0.00 bash -c echo $GPUID
My best attempt was:
import pandas as pd
from io import StringIO as sIO  # on Python 2: from StringIO import StringIO as sIO

std = ...  # the table text
pd.read_table(sIO(std), sep=r'\s+', engine='python')
The error:
ValueError: Expected 7 fields in line 2, saw 9
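The error is consistent with how a pure whitespace separator treats this table: the header splits into 7 tokens, while a row whose command contains spaces splits into more. A minimal sketch of that mismatch, using the header and one row copied from the table above (the variable names are only illustrative):

```python
# Header and one data row copied verbatim from the task-spooler output above.
header = "ID State Output E-Level Times(r/u/s) Command [run=1/2]"
row = "3 finished /tmp/ts-out.lBr4Wy -1 2545.36/0.00/0.02 bash -c python infloop.py"

# str.split() with no argument splits on runs of whitespace, like sep=r'\s+'.
header_fields = header.split()
row_fields = row.split()

print(len(header_fields), len(row_fields))  # 7 9
```

Because the command itself contains spaces, any separator based only on whitespace will produce a varying number of fields per row.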
EDIT: The source code that generates the table is available. Here are the statements that produce each row. Could this help with reading the table into a DataFrame?
if (p->label)
    snprintf(line, maxlen, "%-4i %-10s %-20s %-8i %0.2f/%0.2f/%0.2f %s[%s]"
            "%s\n",
            p->jobid,
            jobstate,
            output_filename,
            p->result.errorlevel,
            p->result.real_ms,
            p->result.user_ms,
            p->result.system_ms,
            dependstr,
            p->label,
            p->command);
else
    snprintf(line, maxlen, "%-4i %-10s %-20s %-8i %0.2f/%0.2f/%0.2f %s%s\n",
            p->jobid,
            jobstate,
            output_filename,
            p->result.errorlevel,
            p->result.real_ms,
            p->result.user_ms,
            p->result.system_ms,
            dependstr,
            p->command);
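Since the format string fixes the order of the fields (job id, state, output file, exit code, three times separated by slashes, then the free-form command), one possible approach is to mirror it with a regular expression and build the DataFrame from the matches. This is only a sketch under assumptions: the names `ROW` and `parse_ts` and the column names `ELevel`, `real`, `user`, `system` are made up for illustration, output file names are assumed to contain no whitespace, and rows for jobs that have not finished are assumed to omit both the exit code and the times, as the running job in the table above does.

```python
import re

import pandas as pd

# Mirrors "%-4i %-10s %-20s %-8i %0.2f/%0.2f/%0.2f %s%s": the exit code and
# the times are optional as a pair, and the command is whatever remains.
ROW = re.compile(
    r"^\s*(?P<ID>\d+)\s+(?P<State>\S+)\s+(?P<Output>\S+)"
    r"(?:\s+(?P<ELevel>-?\d+)"
    r"\s+(?P<real>[\d.]+)/(?P<user>[\d.]+)/(?P<system>[\d.]+))?"
    r"\s*(?P<Command>.*)$"
)


def parse_ts(text):
    """Parse task-spooler listing text into a DataFrame (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.rstrip())
        if match:  # the header does not start with a digit, so it is skipped
            rows.append(match.groupdict())
    columns = ["ID", "State", "Output", "ELevel",
               "real", "user", "system", "Command"]
    df = pd.DataFrame(rows, columns=columns)
    df["ID"] = df["ID"].astype(int)
    # Missing exit codes and times (None) become NaN after conversion.
    for col in ["ELevel", "real", "user", "system"]:
        df[col] = pd.to_numeric(df[col])
    return df


sample = (
    "ID State Output E-Level Times(r/u/s) Command [run=1/2]\n"
    "6 running /tmp/ts-out.FzVneG [l1]python infloop.py\n"
    "0 finished /tmp/ts-out.ixWHm2 0 0.00/0.00/0.00 bash -c echo 1\n"
    "19 finished /tmp/ts-out.LLw2Wl -1 0.00/0.00/0.00\n"
)
df = parse_ts(sample)
print(df)
```

The label (for example `[l1]`) is left as part of the command here; splitting it out would need an extra optional group. A command that itself begins with an integer followed by three slash-separated numbers would be misread on a row without an exit code, which seems unlikely in practice but is a limitation of this sketch.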
Try sep='\t' – EdChum
@EdChum, no. Using '\t' puts all the columns into a single column – yuval
How about df = pd.read_csv('file', sep=r'\s{2,}', engine='python')? The separator is a regular expression meaning '2 or more whitespace characters' – jezrael