How to Fix the Elasticsearch Spark SQL Exception "org.elasticsearch.hadoop.EsHadoopIllegalStateException: Field not found; typically this occurs with arrays which are not mapped as single value"

The actual Elasticsearch index data is nested, as the Spark-inferred schema below shows:

 |-- VendorInfo: struct (nullable = true)
 |    |-- VendorAddress: string (nullable = true)
 |    |-- VendorAlternativeEmail: string (nullable = true)
 |    |-- VendorCellphone: string (nullable = true)
 |    |-- VendorDescription: string (nullable = true)
 |    |-- VendorEmail: string (nullable = true)
 |    |-- VendorHomePhone: string (nullable = true)
 |    |-- VendorName: string (nullable = true)
 |    |-- VendorRole: string (nullable = true)
 |    |-- VendorTaxId: string (nullable = true)
 |    |-- VendorWorkPhone: string (nullable = true)

I tried to read the VendorInfo value through the Spark SQL DataFrame API as below:

val esConfig = Map(("es.nodes", "localhost"), ("es.port", "9200"), ("es.index.auto.create", "false"), ("es.http.timeout", "5m"))

val readEsIndex = sqlContext.read.format("org.elasticsearch.spark.sql").options(esConfig).load("indexname/type")
readEsIndex.registerTempTable("readEsIndex")
val hivedf = sqlContext.sql("select VendorInfo from readEsIndex")

I got the exception below:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 7.0 failed 4 times, most recent failure: Lost task 0.3 in stage 7.0 (TID 40, n01bdl603.aap.csaa.pri): org.elasticsearch.hadoop.EsHadoopIllegalStateException: Field 'VendorTaxId' not found; typically this occurs with arrays which are not mapped as single value.

Why does this happen? An Elasticsearch mapping does not distinguish a field holding a single value from one holding an array of values, so the connector assumes single values by default; when a document actually stores an array, the read fails with this exception.

How did I resolve this?

1. Added the "es.read.field.as.array" configuration to the esConfig, listing each affected field.

2. Added the below version of the elasticsearch-spark package:

libraryDependencies += "org.elasticsearch" % "elasticsearch-spark_2.10" % "2.2.0"
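For context, this dependency sits in build.sbt alongside the Spark modules. A minimal sketch, where the Spark and Scala versions are my assumption (elasticsearch-spark_2.10 2.2.0 targets the Spark 1.x DataFrame API used in this post):

```scala
// build.sbt (sketch; spark-sql version is assumed, adjust to your cluster)
scalaVersion := "2.10.6"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql" % "1.6.1" % "provided",
  "org.elasticsearch" % "elasticsearch-spark_2.10" % "2.2.0"
)
```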

3. Added the below imports:
import org.elasticsearch.hadoop.cfg.ConfigurationOptions._
import org.elasticsearch.hadoop.cfg.PropertiesSettings
import org.elasticsearch.hadoop.util.StringUtils
import org.elasticsearch.spark.cfg._
import org.elasticsearch.spark.sql._
import org.elasticsearch.spark.sql.api.java.JavaEsSparkSQL
import org.elasticsearch.spark.sql.sqlContextFunctions
import org.elasticsearch.hadoop.EsHadoopIllegalArgumentException
import org.elasticsearch.hadoop.serialization.EsHadoopSerializationException
import org.apache.spark.sql.types.DoubleType

4. Re-tried as below:
val esConfig = Map(
  ("es.nodes", "localhost"),
  ("es.port", "9200"),
  ("es.index.auto.create", "false"),
  ("es.http.timeout", "5m"),
  ("es.read.field.as.array", "VendorInfo.VendorAddress,VendorInfo.VendorAlternativeEmail,VendorInfo.VendorCellphone,VendorInfo.VendorDescription,VendorInfo.VendorEmail,VendorInfo.VendorHomePhone,VendorInfo.VendorName,VendorInfo.VendorRole,VendorInfo.VendorTaxId,VendorInfo.VendorWorkPhone")
)

val readEsIndex = sqlContext.read.format("org.elasticsearch.spark.sql").options(esConfig).load("indexname/type")
readEsIndex.registerTempTable("readEsIndex")
val hivedf = sqlContext.sql("select VendorInfo from readEsIndex")

It worked fine and returned the records.
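The long comma-separated field list is easy to mistype. A small sketch of a helper that generates it from the sub-field names; arrayFieldList is a hypothetical function of mine, not part of elasticsearch-hadoop:

```scala
// Hypothetical helper: builds the comma-separated value expected by the
// "es.read.field.as.array" option from a parent field and its sub-fields.
def arrayFieldList(parent: String, fields: Seq[String]): String =
  fields.map(f => s"$parent.$f").mkString(",")

val vendorFields = Seq(
  "VendorAddress", "VendorAlternativeEmail", "VendorCellphone",
  "VendorDescription", "VendorEmail", "VendorHomePhone",
  "VendorName", "VendorRole", "VendorTaxId", "VendorWorkPhone")

val esConfig = Map(
  "es.nodes"               -> "localhost",
  "es.port"                -> "9200",
  "es.index.auto.create"   -> "false",
  "es.http.timeout"        -> "5m",
  "es.read.field.as.array" -> arrayFieldList("VendorInfo", vendorFields))
```

This builds the same Map as the literal version above, so it can be passed to .options(esConfig) unchanged.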
