2017-02-18

Why can't I connect to the Hive metastore using Apache Spark?

I am trying to connect to Apache Hive from Apache Spark in a Java program. Here is the program:

import org.apache.spark.sql.SparkSession; 

public class queryhive { 

    public static void main(String[] args) { 
        String warehouseLocation = "spark-warehouse"; 

        // Build a SparkSession with Hive support so that spark.sql() can
        // resolve tables registered in the Hive metastore.
        SparkSession spark = SparkSession 
            .builder() 
            .appName("Java Spark Hive Example") 
            .master("local[*]") 
            .config("spark.sql.warehouse.dir", warehouseLocation) 
            .enableHiveSupport() 
            .getOrCreate(); 

        try { 
            spark.sql("select count(*) from health1").show(); 
        } catch (Exception e) {   // an AnalysisException lands here when the table cannot be resolved
            System.out.print("\nTable is not found\n"); 
        } 
    } 
} 

In my Maven pom.xml I added the HDFS address and the Hive address inside the <properties> tag.
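For reference, here is a minimal sketch of how a SparkSession is usually pointed at an external Hive metastore directly in code; the thrift URI, port, and warehouse path below are assumptions, not my actual configuration:

import org.apache.spark.sql.SparkSession;

public class QueryHiveMetastore {

    public static void main(String[] args) {
        // Sketch only: "thrift://localhost:9083" is a placeholder for the real
        // Hive metastore URI, and the warehouse dir is likewise a placeholder.
        SparkSession spark = SparkSession
            .builder()
            .appName("Java Spark Hive Example")
            .master("local[*]")
            .config("hive.metastore.uris", "thrift://localhost:9083")
            .config("spark.sql.warehouse.dir", "hdfs://localhost:8020/user/hive/warehouse")
            .enableHiveSupport()
            .getOrCreate();

        // Quick check that the metastore answers and the expected databases are visible.
        spark.sql("show databases").show();
    }
}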
I want to query the Hive tables using Spark, but I cannot see the tables: I get a "table not found" exception. Here is the output:

log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell). 
log4j:WARN Please initialize the log4j system properly. 
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. 
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 
17/02/18 11:30:56 INFO SparkContext: Running Spark version 2.1.0 
17/02/18 11:30:56 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
17/02/18 11:30:56 WARN Utils: Your hostname, aims resolves to a loopback address: 127.0.1.1; using 10.0.0.3 instead (on interface wlp2s0) 
17/02/18 11:30:56 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address 
17/02/18 11:30:56 INFO SecurityManager: Changing view acls to: aims 
17/02/18 11:30:56 INFO SecurityManager: Changing modify acls to: aims 
17/02/18 11:30:56 INFO SecurityManager: Changing view acls groups to: 
17/02/18 11:30:56 INFO SecurityManager: Changing modify acls groups to: 
17/02/18 11:30:56 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(aims); groups with view permissions: Set(); users with modify permissions: Set(aims); groups with modify permissions: Set() 
17/02/18 11:30:57 INFO Utils: Successfully started service 'sparkDriver' on port 32975. 
17/02/18 11:30:57 INFO SparkEnv: Registering MapOutputTracker 
17/02/18 11:30:57 INFO SparkEnv: Registering BlockManagerMaster 
17/02/18 11:30:57 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 
17/02/18 11:30:57 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up 
17/02/18 11:30:57 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-6263f04a-5c65-4dda-9e9a-faafb32a066a 
17/02/18 11:30:57 INFO MemoryStore: MemoryStore started with capacity 335.4 MB 
17/02/18 11:30:57 INFO SparkEnv: Registering OutputCommitCoordinator 
17/02/18 11:30:58 INFO Utils: Successfully started service 'SparkUI' on port 4040. 
17/02/18 11:30:58 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.0.0.3:4040 
17/02/18 11:30:58 INFO Executor: Starting executor ID driver on host localhost 
17/02/18 11:30:58 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 43772. 
17/02/18 11:30:58 INFO NettyBlockTransferService: Server created on 10.0.0.3:43772 
17/02/18 11:30:58 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy 
17/02/18 11:30:58 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.0.0.3, 43772, None) 
17/02/18 11:30:58 INFO BlockManagerMasterEndpoint: Registering block manager 10.0.0.3:43772 with 335.4 MB RAM, BlockManagerId(driver, 10.0.0.3, 43772, None) 
17/02/18 11:30:58 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.0.0.3, 43772, None) 
17/02/18 11:30:58 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 10.0.0.3, 43772, None) 
17/02/18 11:30:58 INFO SharedState: Warehouse path is 'hdfs://localhost:8020/user/hive/warehouse/default.db/spark-warehouse'. 
17/02/18 11:30:58 INFO HiveUtils: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes. 
17/02/18 11:30:59 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore 
17/02/18 11:30:59 INFO ObjectStore: ObjectStore, initialize called 
17/02/18 11:31:00 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored 
17/02/18 11:31:00 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored 
17/02/18 11:31:02 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order" 
17/02/18 11:31:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 
17/02/18 11:31:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 
17/02/18 11:31:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 
17/02/18 11:31:03 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 
17/02/18 11:31:03 INFO Query: Reading in results for query "[email protected]" since the connection used is closing 
17/02/18 11:31:03 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY 
17/02/18 11:31:03 INFO ObjectStore: Initialized ObjectStore 
17/02/18 11:31:05 INFO HiveMetaStore: Added admin role in metastore 
17/02/18 11:31:05 INFO HiveMetaStore: Added public role in metastore 
17/02/18 11:31:05 INFO HiveMetaStore: No user is added in admin role, since config is empty 
17/02/18 11:31:05 INFO HiveMetaStore: 0: get_all_databases 
17/02/18 11:31:05 INFO audit: ugi=aims ip=unknown-ip-addr cmd=get_all_databases 
17/02/18 11:31:05 INFO HiveMetaStore: 0: get_functions: db=default pat=* 
17/02/18 11:31:05 INFO audit: ugi=aims ip=unknown-ip-addr cmd=get_functions: db=default pat=* 
17/02/18 11:31:05 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table. 
17/02/18 11:31:06 INFO SessionState: Created local directory: /tmp/cac4110a-ebb3-47a6-b21e-682a12724ba2_resources 
17/02/18 11:31:06 INFO SessionState: Created HDFS directory: /tmp/hive/aims/cac4110a-ebb3-47a6-b21e-682a12724ba2 
17/02/18 11:31:06 INFO SessionState: Created local directory: /tmp/aims/cac4110a-ebb3-47a6-b21e-682a12724ba2 
17/02/18 11:31:06 INFO SessionState: Created HDFS directory: /tmp/hive/aims/cac4110a-ebb3-47a6-b21e-682a12724ba2/_tmp_space.db 
17/02/18 11:31:06 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.1) is hdfs://localhost:8020/user/hive/warehouse/default.db/spark-warehouse 
17/02/18 11:31:06 INFO HiveMetaStore: 0: get_database: default 
17/02/18 11:31:06 INFO audit: ugi=aims ip=unknown-ip-addr cmd=get_database: default 
17/02/18 11:31:06 INFO HiveMetaStore: 0: get_database: global_temp 
17/02/18 11:31:06 INFO audit: ugi=aims ip=unknown-ip-addr cmd=get_database: global_temp 
17/02/18 11:31:06 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException 
17/02/18 11:31:06 INFO SparkSqlParser: Parsing command: select count(*) from health1 
17/02/18 11:31:08 INFO HiveMetaStore: 0: get_table : db=default tbl=health1 
17/02/18 11:31:08 INFO audit: ugi=aims ip=unknown-ip-addr cmd=get_table : db=default tbl=health1 
17/02/18 11:31:08 INFO HiveMetaStore: 0: get_table : db=default tbl=health1 
17/02/18 11:31:08 INFO audit: ugi=aims ip=unknown-ip-addr cmd=get_table : db=default tbl=health1 

Table is not found 
17/02/18 11:31:08 INFO SparkContext: Invoking stop() from shutdown hook 
17/02/18 11:31:08 INFO SparkUI: Stopped Spark web UI at http://10.0.0.3:4040 
17/02/18 11:31:08 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 
17/02/18 11:31:08 INFO MemoryStore: MemoryStore cleared 
17/02/18 11:31:08 INFO BlockManager: BlockManager stopped 
17/02/18 11:31:08 INFO BlockManagerMaster: BlockManagerMaster stopped 
17/02/18 11:31:08 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 
17/02/18 11:31:08 INFO SparkContext: Successfully stopped SparkContext 
17/02/18 11:31:08 INFO ShutdownHookManager: Shutdown hook called 
17/02/18 11:31:08 INFO ShutdownHookManager: Deleting directory /tmp/spark-ea93f7ec-6151-43e9-b5d9-bedbba537d62 

I am using Apache Hive 1.2.0 and Spark 2.1.0.
I do not think the problem is caused by the versions. I am using Eclipse Neon as the IDE. Please let me know why I am running into this problem and how I can solve it.

Answer


You need to specify the schema name: either query as select * from schemaName.tableName, or as shown below.

try 
{ 
    spark.sql("use schemaName");   // switch to the schema (database) that holds the table 
    spark.sql("select count(*) from health1").show(); 
} 
catch (Exception e) 
{ 
    System.out.print("\nTable is not found\n"); 
} 
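Or, equivalently, qualify the table with the schema name in the query itself; here schemaName is a placeholder for the actual database that holds the table:

try 
{ 
    // Single statement, no "use": the table is resolved as schemaName.health1 
    spark.sql("select count(*) from schemaName.health1").show(); 
} 
catch (Exception e) 
{ 
    System.out.print("\nTable is not found\n"); 
} 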

Tried it, it does not work. –


What is in the HDFS location /user/hive/warehouse? Can you find your schema, and then the table, under that location? –


It is a file, comment.csv, which contains all the data. –
