2015-08-25 7 views

Error processing a complex Twitter JSON object with Pig JsonLoader() from the elephant-bird jars

I wanted to process a Twitter JSON object with Pig using the elephant-bird jars, and wrote the Pig script shown below.

REGISTER '/usr/lib/pig/lib/elephant-bird-hadoop-compat-4.1.jar';
REGISTER '/usr/lib/pig/lib/elephant-bird-pig-4.1.jar';

A = LOAD '/user/flume/tweets/data.json' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') AS myMap;
B = FOREACH A GENERATE myMap#'id' AS ID, myMap#'created_at' AS createdAT;
DUMP B;

which gave me the error below:

2015-08-25 11:06:34,295 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1439883208520_0177 
 
2015-08-25 11:06:34,295 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases A,B 
 
2015-08-25 11:06:34,295 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: A[3,4],B[4,4] C: R: 
 
2015-08-25 11:06:34,303 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete 
 
2015-08-25 11:06:34,303 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1439883208520_0177] 
 
2015-08-25 11:07:06,449 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete 
 
2015-08-25 11:07:06,449 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1439883208520_0177] 
 
2015-08-25 11:07:09,458 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure. 
 
2015-08-25 11:07:09,458 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1439883208520_0177 has failed! Stop running all dependent jobs 
 
2015-08-25 11:07:09,459 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete 
 
2015-08-25 11:07:09,667 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://trinityhadoopmaster.com:8188/ws/v1/timeline/ 
 
2015-08-25 11:07:09,668 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at trinityhadoopmaster.com/192.168.1.135:8032 
 
2015-08-25 11:07:09,678 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server 
 
2015-08-25 11:07:09,779 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 0: java.lang.ClassNotFoundException: org.json.simple.parser.ParseException 
 
2015-08-25 11:07:09,779 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed! 
 
2015-08-25 11:07:09,780 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: 
 

 
HadoopVersion PigVersion  UserId StartedAt  FinishedAt  Features 
 
2.6.0 0.14.0 hdfs 2015-08-25 11:06:33  2015-08-25 11:07:09  UNKNOWN 
 

 
Failed! 
 

 
Failed Jobs: 
 
JobId Alias Feature Message Outputs 
 
job_1439883208520_0177 A,B  MAP_ONLY  Message: Job failed! hdfs://trinityhadoopmaster.com:9000/tmp/temp1554332510/tmp835744559, 
 

 
Input(s): 
 
Failed to read data from "hdfs://trinityhadoopmaster.com:9000/user/flume/tweets/data.json" 
 

 
Output(s): 
 
Failed to produce result in "hdfs://trinityhadoopmaster.com:9000/tmp/temp1554332510/tmp835744559" 
 

 
Counters: 
 
Total records written : 0 
 
Total bytes written : 0 
 
Spillable Memory Manager spill count : 0 
 
Total bags proactively spilled: 0 
 
Total records proactively spilled: 0 
 

 
Job DAG: 
 
job_1439883208520_0177 
 

 

 
2015-08-25 11:07:09,780 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed! 
 
2015-08-25 11:07:09,787 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias B. Backend error : java.lang.ClassNotFoundException: org.json.simple.parser.ParseException 
 
Details at logfile: /tmp/pig-err.log 
 
grunt>

I have no idea how to approach this. Can anyone help me with it?


For people who found this post while searching for [ERROR 1066: Unable to open iterator for alias](http://stackoverflow.com/questions/34495085/error-1066-unable-to-open-iterator-for-alias-in-pig-generic-solution), here is a [generic solution](http://stackoverflow.com/a/34495086/983722).

Answer

REGISTER '/tmp/elephant-bird-core-4.1.jar'; 

REGISTER '/tmp/elephant-bird-pig-4.1.jar'; 

REGISTER '/tmp/elephant-bird-hadoop-compat-4.1.jar'; 

REGISTER '/tmp/google-collections-1.0.jar'; 

REGISTER '/tmp/json-simple-1.1.jar'; 

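This works. The class named in the error, org.json.simple.parser.ParseException, belongs to the json-simple library, so registering json-simple-1.1.jar alongside the elephant-bird jars makes it available to the map tasks.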
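For reference, here is a sketch of how the whole script might look once the REGISTER lines from this answer are combined with the LOAD, FOREACH, and DUMP statements from the question. It assumes the jars really are in /tmp as in the answer; if they live elsewhere (the question used /usr/lib/pig/lib), the paths need adjusting. The statements themselves are copied unchanged from the question.

```pig
-- Jar locations below follow the answer (/tmp); adjust them to wherever
-- the jars are actually stored on your machine.
REGISTER '/tmp/elephant-bird-core-4.1.jar';
REGISTER '/tmp/elephant-bird-pig-4.1.jar';
REGISTER '/tmp/elephant-bird-hadoop-compat-4.1.jar';
REGISTER '/tmp/google-collections-1.0.jar';
-- json-simple supplies org.json.simple.parser.ParseException,
-- the class reported as missing in the question's log.
REGISTER '/tmp/json-simple-1.1.jar';

-- Load each tweet as a map, exactly as in the question's script.
A = LOAD '/user/flume/tweets/data.json' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') AS myMap;

-- Pull two top-level fields out of the map.
B = FOREACH A GENERATE myMap#'id' AS ID, myMap#'created_at' AS createdAT;

DUMP B;
```

If the job still fails with a ClassNotFoundException for a different class, the same approach should apply: find the jar that contains the missing class and REGISTER it as well.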
