2012-12-02 7 views
2

I'm trying to get past a problem that appeared suddenly. Before it appeared I was using an old virtual machine. I downloaded a new virtual machine and still can't get my job to run. I get a Java heap space error. I have already read this post: "out of Memory Error in Hadoop". My setup is CDH 4.1, and the error is: Error running child : java.lang.OutOfMemoryError: Java heap space

Here are my configs from /etc/hadoop/conf:

sudo vi hadoop-env.sh

# Extra Java runtime options. Empty by default. 
#export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true $HADOOP_CLIENT_OPTS" 
export HADOOP_CLIENT_OPTS="-Xmx256m $HADOOP_CLIENT_OPTS" 

Here is my mapred-site.xml:

<property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx256m</value>
</property>
<property>
    <name>io.sort.mb</name>
    <value>128</value>
</property>

Nothing helps :(

Here is the console output:

Deleted /var/log/hadoop-yarn/apps/cloudera/logs 
12/12/02 16:31:45 INFO input.FileInputFormat: Total input paths to process : 1 
12/12/02 16:31:45 INFO mapreduce.JobSubmitter: number of splits:13 
12/12/02 16:31:45 WARN conf.Configuration: mapred.job.classpath.files is deprecated. Instead, use mapreduce.job.classpath.files 
12/12/02 16:31:45 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar 
12/12/02 16:31:45 WARN conf.Configuration: mapred.cache.files is deprecated. Instead, use mapreduce.job.cache.files 
12/12/02 16:31:45 WARN conf.Configuration: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class 
12/12/02 16:31:45 WARN conf.Configuration: mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used 
12/12/02 16:31:45 WARN conf.Configuration: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class 
12/12/02 16:31:45 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name 
12/12/02 16:31:45 WARN conf.Configuration: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class 
12/12/02 16:31:45 WARN conf.Configuration: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class 
12/12/02 16:31:45 WARN conf.Configuration: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir 
12/12/02 16:31:45 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir 
12/12/02 16:31:45 WARN conf.Configuration: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class 
12/12/02 16:31:45 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps 
12/12/02 16:31:45 WARN conf.Configuration: mapred.cache.files.timestamps is deprecated. Instead, use mapreduce.job.cache.files.timestamps 
12/12/02 16:31:45 WARN conf.Configuration: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class 
12/12/02 16:31:45 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir 
12/12/02 16:31:46 INFO mapred.ResourceMgrDelegate: Submitted application application_1354455034384_0007 to ResourceManager at /0.0.0.0:8032 
12/12/02 16:31:46 INFO mapreduce.Job: The url to track the job: http://localhost.localdomain:8088/proxy/application_1354455034384_0007/ 
12/12/02 16:31:46 INFO mapreduce.Job: Running job: job_1354455034384_0007 
12/12/02 16:31:51 INFO mapreduce.Job: Job job_1354455034384_0007 running in uber mode : false 
12/12/02 16:31:51 INFO mapreduce.Job: map 0% reduce 0% 
12/12/02 16:32:02 INFO mapreduce.Job: Task Id : attempt_1354455034384_0007_m_000005_0, Status : FAILED 

Killed by external signal 

12/12/02 16:32:15 INFO mapreduce.Job: Task Id : attempt_1354455034384_0007_m_000006_0, Status : FAILED 
Error: Java heap space 
12/12/02 16:32:19 INFO mapreduce.Job: map 1% reduce 0% 
12/12/02 16:32:29 INFO mapreduce.Job: map 2% reduce 0% 
12/12/02 16:32:29 INFO mapreduce.Job: Task Id : attempt_1354455034384_0007_m_000005_1, Status : FAILED 
Error: Java heap space 
12/12/02 16:32:36 INFO mapreduce.Job: map 3% reduce 0% 
12/12/02 16:32:40 INFO mapreduce.Job: Task Id : attempt_1354455034384_0007_m_000006_1, Status : FAILED 
Error: Java heap space 
12/12/02 16:32:43 INFO mapreduce.Job: map 4% reduce 0% 
12/12/02 16:32:51 INFO mapreduce.Job: Task Id : attempt_1354455034384_0007_m_000005_2, Status : FAILED 
Error: Java heap space 
12/12/02 16:32:53 INFO mapreduce.Job: map 5% reduce 0% 
12/12/02 16:33:00 INFO mapreduce.Job: map 6% reduce 0% 
12/12/02 16:33:03 INFO mapreduce.Job: Task Id : attempt_1354455034384_0007_m_000006_2, Status : FAILED 
Error: Java heap space 
12/12/02 16:33:07 INFO mapreduce.Job: map 7% reduce 0% 
12/12/02 16:33:15 INFO mapreduce.Job: map 8% reduce 0% 
12/12/02 16:33:15 INFO mapreduce.Job: map 15% reduce 0% 
12/12/02 16:33:15 INFO mapreduce.Job: Job job_1354455034384_0007 failed with state FAILED due to: 
12/12/02 16:33:15 INFO mapreduce.Job: Counters: 31 
    File System Counters 
     FILE: Number of bytes read=600 
     FILE: Number of bytes written=349925 
     FILE: Number of read operations=0 
     FILE: Number of large read operations=0 
     FILE: Number of write operations=0 
     HDFS: Number of bytes read=105310577 
     HDFS: Number of bytes written=0 
     HDFS: Number of read operations=15 
     HDFS: Number of large read operations=0 
     HDFS: Number of write operations=0 
    Job Counters 
     Failed map tasks=7 
     Launched map tasks=12 
     Other local map tasks=5 
     Data-local map tasks=7 
     Total time spent by all maps in occupied slots (ms)=597080 
     Total time spent by all reduces in occupied slots (ms)=0 
    Map-Reduce Framework 
     Map input records=252675 
     Map output records=15 
     Map output bytes=992 
     Map output materialized bytes=0 
     Input split bytes=590 
     Combine input records=0 
     Spilled Records=0 
     Failed Shuffles=0 
     Merged Map outputs=0 
     GC time elapsed (ms)=272051 
     CPU time spent (ms)=112790 
     Physical memory (bytes) snapshot=1094082560 
     Virtual memory (bytes) snapshot=3678527488 
     Total committed heap usage (bytes)=780533760 
    File Input Format Counters 
     Bytes Read=105938372 

Here is my task log:

2012-12-02 16:36:13,072 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started 
2012-12-02 16:36:14,405 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now. 
2012-12-02 16:36:15,552 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/cloudera/appcache/application_1354455034384_0008 
2012-12-02 16:36:15,916 WARN [main] org.apache.hadoop.conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id 
2012-12-02 16:36:15,919 WARN [main] org.apache.hadoop.conf.Configuration: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap 
2012-12-02 16:36:15,920 WARN [main] org.apache.hadoop.conf.Configuration: mapred.tip.id is deprecated. Instead, use mapreduce.task.id 
2012-12-02 16:36:15,920 WARN [main] org.apache.hadoop.conf.Configuration: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition 
2012-12-02 16:36:15,921 WARN [main] org.apache.hadoop.conf.Configuration: mapred.local.dir is deprecated. Instead, use mapreduce.cluster.local.dir 
2012-12-02 16:36:15,922 WARN [main] org.apache.hadoop.conf.Configuration: job.local.dir is deprecated. Instead, use mapreduce.job.local.dir 
2012-12-02 16:36:15,922 WARN [main] org.apache.hadoop.conf.Configuration: mapred.cache.localFiles is deprecated. Instead, use mapreduce.job.cache.local.files 
2012-12-02 16:36:15,922 WARN [main] org.apache.hadoop.conf.Configuration: mapred.job.id is deprecated. Instead, use mapreduce.job.id 
2012-12-02 16:36:16,575 WARN [main] org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id 
2012-12-02 16:36:18,332 INFO [main] org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : [email protected]32 
2012-12-02 16:36:21,168 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space 
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:912) 
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:638) 
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:709) 
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332) 
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:396) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332) 
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147) 

2012-12-02 16:36:21,448 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics system... 
2012-12-02 16:36:21,448 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system stopped. 
2012-12-02 16:36:21,448 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system shutdown complete. 
container_1354455034384_0008_01_000005
stderr 156
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
stdout 0
syslog 3515
2012-12-02 16:35:31,838 WARN [main] org.apache.hadoop.metrics2.impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-maptask.properties,hadoop-metrics2.properties
2012-12-02 16:35:32,283 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s). 
2012-12-02 16:35:32,283 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started 
2012-12-02 16:35:33,314 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now. 
2012-12-02 16:35:34,339 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/cloudera/appcache/application_1354455034384_0008 
2 

What can I try next? Please help. Thanks.

+0

'-XX:-HeapDumpOnOutOfMemoryError' (http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html#DebuggingOptions)

+0

I set this: export HADOOP_CLIENT_OPTS="-Xmx256m -XX:-HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/cloudera $HADOOP_CLIENT_OPTS", and I don't see any dumps in the specified folder. What am I doing wrong? – Sergey

+0

A few guesses: the directory does not exist; the Java process has no write permission for that directory; the Hadoop process does not pick up '$HADOOP_CLIENT_OPTS'; 256m is simply not enough, so try a larger number.
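The first two guesses in the comment above can be checked from a shell. What follows is a minimal sketch, not a definitive procedure: `check_dump_dir` is a hypothetical helper written for this illustration, and the directory is the one quoted in the thread. Two things are worth keeping in mind while reading it. In HotSpot, a boolean option is switched on with `-XX:+Name` and off with `-XX:-Name`, so `-XX:-HeapDumpOnOutOfMemoryError` asks the JVM not to write a dump. Also, as far as I know, HADOOP_CLIENT_OPTS is read by the client-side hadoop commands, while the map and reduce task JVMs take their options from mapred.child.java.opts, so dump flags for a failing task probably belong there.

```shell
#!/usr/bin/env bash
# Sketch only: check_dump_dir is a hypothetical helper, not part of Hadoop.

# Report whether a directory exists and is writable by the current user.
check_dump_dir() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing"
    elif [ ! -w "$dir" ]; then
        echo "not-writable"
    else
        echo "ok"
    fi
}

# Path taken from the thread; override DUMP_DIR to test another location.
DUMP_DIR="${DUMP_DIR:-/home/cloudera}"
echo "dump dir status: $(check_dump_dir "$DUMP_DIR")"

# '+' enables a boolean HotSpot flag, '-' disables it.
DUMP_OPTS="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$DUMP_DIR"
echo "$DUMP_OPTS"
```

Run the check as the user that owns the Java process in question, since write permission depends on who is asking.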

Answers

2

OK, it looks like the problem is solved. Here is my mapred-site.xml:

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>0.0.0.0:8021</value>
    </property>

    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>

    <property>
        <description>To set the value of tmp directory for map and reduce tasks.</description>
        <name>mapreduce.task.tmp.dir</name>
        <value>/var/lib/hadoop-mapreduce/cache/${user.name}/tasks</value>
    </property>

    <property>
        <name>jobtracker.thrift.address</name>
        <value>0.0.0.0:9290</value>
    </property>
    <property>
        <name>mapred.jobtracker.plugins</name>
        <value>org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin</value>
        <description>Comma-separated list of jobtracker plug-ins to be activated.</description>
    </property>
    <property>
        <name>mapred.child.java.opts</name>
        <value>-Xmx512m</value>
    </property>
    <property>
        <name>io.sort.mb</name>
        <value>64</value>
    </property>
</configuration>

Here is my hadoop-env.sh:

export HADOOP_CLIENT_OPTS="-Xmx512m -XX:-HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/cloudera/heapdump $HADOOP_CLIENT_OPTS" 

I suppose the problem was in applying the changes. Only a full restart helped; I had tried restarting the services, but that had no effect.
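Apart from the restart, the two configurations in this thread differ in how much of the task heap the map-side sort buffer takes. The stack trace in the question fails inside MapTask$MapOutputBuffer.<init>, which, as far as I know, is where the buffer sized by io.sort.mb is allocated. The sketch below only does the arithmetic on the values quoted in the thread; `sort_buffer_share` is a hypothetical helper name, and the result is a rough ratio, not a measurement of what the JVM actually used.

```shell
#!/usr/bin/env bash
# Sketch only: sort_buffer_share is a hypothetical helper for illustration.

# Share of the task heap taken by the sort buffer, in whole percent.
# Integer division truncates, so 12.5 is reported as 12.
sort_buffer_share() {
    local heap_mb="$1"
    local sort_mb="$2"
    echo $(( sort_mb * 100 / heap_mb ))
}

before=$(sort_buffer_share 256 128)   # -Xmx256m, io.sort.mb=128 (question)
after=$(sort_buffer_share 512 64)     # -Xmx512m, io.sort.mb=64 (answer)

echo "before: ${before}% of heap, after: ${after}% of heap"
```

With the original settings the buffer alone asks for about half of the heap, which leaves little room for everything else the task needs; with the new settings it is roughly an eighth.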

+0

True. The problem was in applying the changes. A reboot worked for me. Thanks! – Pradeep

1

You can also try setting mapred.job.shuffle.input.buffer.percent to 20%. The default is 70%, which can be a lot of memory if you are working with a very large data set.

Try setting this property in mapred-site.xml:

<property> 
    <name>mapred.job.shuffle.input.buffer.percent</name> 
    <value>0.20</value> 
</property> 
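To get a feel for what this setting changes, here is a rough calculation, assuming the 512 MB task heap from the other answer. `shuffle_buffer_mb` is a hypothetical helper name used only for this sketch. Note that this buffer holds map outputs during the shuffle on the reduce side, while the failed attempts in the question are map tasks, so the setting may matter more for jobs whose reducers run out of memory.

```shell
#!/usr/bin/env bash
# Sketch only: shuffle_buffer_mb is a hypothetical helper for illustration.

# Heap reserved for shuffled map outputs, in whole MB, for a given percentage.
shuffle_buffer_mb() {
    local heap_mb="$1"
    local percent="$2"
    echo $(( heap_mb * percent / 100 ))
}

default_mb=$(shuffle_buffer_mb 512 70)   # default 0.70
tuned_mb=$(shuffle_buffer_mb 512 20)     # suggested 0.20

echo "default: ${default_mb} MB, tuned: ${tuned_mb} MB"
```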

Please refer to this document for more performance tuning settings: http://developer.amd.com.php53-23.ord1-1.websitetestlink.com/wordpress/media/2012/10/Hadoop_Tuning_Guide-Version5.pdf

Hope someone finds this useful :)