ES Data Processing Method

2024/7/7 6:40:32 Tags: elasticsearch, java, frontend

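Because the log data is stored in an ES (Elasticsearch) project, the logs have to be pulled from ES for analysis. The SQL below processes that data: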

select

     traceid    -- STRING   COMMENT 'flow id',

    ,appnum     -- BIGINT   COMMENT 'iteration number',

    ,appversion -- STRING   COMMENT 'app version',

    ,appcode    -- STRING   COMMENT 'application code',

    ,type       -- STRING   COMMENT 'type',

    ,spanid     -- STRING   COMMENT 'module id',

    ,apptype    -- STRING   COMMENT 'application type; see the definition for details',

    ,eventtime  -- DATETIME COMMENT 'date',

    ,name       -- STRING   COMMENT 'name',

    ,id         -- STRING   COMMENT 'id',

    ,theid      -- STRING   COMMENT 'theId'

    ,preid

    -------------data----------------    

    ,GET_JSON_OBJECT(data_tmp,'$.allInOne') AS allInOne  

    ,GET_JSON_OBJECT(data_tmp,'$.class')      AS class

    ,GET_JSON_OBJECT(data_tmp,'$.classCode')      AS classCode

    ,GET_JSON_OBJECT(data_tmp,'$.deviceId')     AS deviceId

    ,GET_JSON_OBJECT(data_tmp,'$.grade')   AS grade

    ,GET_JSON_OBJECT(data_tmp,'$.gradeCode') AS gradeCode

    ,GET_JSON_OBJECT(data_tmp,'$.handleTime') AS handleTime

    ,GET_JSON_OBJECT(data_tmp,'$.heigth') AS heigth

    ,cast(ipint(GET_JSON_OBJECT(json_build,'$.ip')) as string) AS ipAddr

    ,GET_JSON_OBJECT(data_tmp,'$.isSuccess') AS isSuccess     -- isSuccess (1 = yes, 0 = no)

    ,GET_JSON_OBJECT(data_tmp,'$.loginMode') AS loginMode     -- login mode: 1 = guest login, 2 = account login

    ,GET_JSON_OBJECT(data_tmp,'$.loginType') AS loginType     -- login type: 1 = online login, 2 = offline login

    ,GET_JSON_OBJECT(data_tmp,'$.school') AS school

    ,GET_JSON_OBJECT(data_tmp,'$.schoolCode') AS schoolCode  

    ,GET_JSON_OBJECT(data_tmp,'$.width') AS width

    ,GET_JSON_OBJECT(data_tmp,'$.subject') AS subject

    ,GET_JSON_OBJECT(data_tmp,'$.subjectCode') AS subjectCode

    ,GET_JSON_OBJECT(data_tmp,'$.classTime') AS classTime

    ,GET_JSON_OBJECT(data_tmp,'$.reason') AS reason

    ,GET_JSON_OBJECT(data_tmp,'$.operateVersion') AS operateVersion  

    ---------- newly added ---------

    ,COALESCE(GET_JSON_OBJECT(data_tmp,'$.userId')

             ,GET_JSON_OBJECT(data_tmp,'$.teacherCode')) AS userId  -- userId, falling back to teacherCode

    ,GET_JSON_OBJECT(data_tmp,'$.userName') AS userName

    ,GET_JSON_OBJECT(data_tmp,'$.userType') AS userType

    ,GET_JSON_OBJECT(data_tmp,'$.account') AS account

    ,GET_JSON_OBJECT(data_tmp,'$.courseId') AS courseId  

    ,GET_JSON_OBJECT(data_tmp,'$.pageName') AS pageName  

    ,GET_JSON_OBJECT(data_tmp,'$.pageTitle') AS pageTitle  

    ,COALESCE(GET_JSON_OBJECT(data_tmp,'$.describe')

             ,GET_JSON_OBJECT(data_tmp,'$.eventDesc')

             ,'') AS describes

    ,GET_JSON_OBJECT(data_tmp,'$.source') AS source      

    ,GET_JSON_OBJECT(data_tmp,'$.topDistance') AS topDistance  

    ,GET_JSON_OBJECT(data_tmp,'$.size') AS sizes  

   ---------------json_build--------------------

    ,GET_JSON_OBJECT(json_build,'$.sysVersion')   AS sysVersion    

    ,GET_JSON_OBJECT(json_build,'$.cpuType')     AS cpuType  

    ,GET_JSON_OBJECT(json_build,'$.memory') AS memory

    ,GET_JSON_OBJECT(json_build,'$.netType') AS netType

    ,GET_JSON_OBJECT(json_build,'$.sysName') AS sysName

    ,GET_JSON_OBJECT(json_build,'$.deviceModel') AS deviceModel

    ,GET_JSON_OBJECT(json_build,'$.deviceNo') AS deviceNo  

   ------------------- newly added --------------------

   ,GET_JSON_OBJECT(json_build,'$.screenHeight') AS screenHeight  

   ,GET_JSON_OBJECT(json_build,'$.screenWidth') AS screenWidth  

   ,GET_JSON_OBJECT(json_build,'$.browserName') AS browserName

   ,GET_JSON_OBJECT(json_build,'$.browserVersion') AS browserVersion

   ,GET_JSON_OBJECT(json_build,'$.browserWidth') AS browserWidth

   ,GET_JSON_OBJECT(json_build,'$.browserHeight') AS browserHeight

   ,GET_JSON_OBJECT(json_build,'$.ip') AS ip

   ,GET_JSON_OBJECT(json_build,'$.remoteIp') AS remoteIp

   ,GET_JSON_OBJECT(data_tmp,'$.actionName') AS actionName

   ,GET_JSON_OBJECT(data_tmp,'$.finishStatus') AS finishStatus

   ,GET_JSON_OBJECT(data_tmp,'$.isFirst') AS isFirst

   ,GET_JSON_OBJECT(data_tmp,'$.bankType') AS bankType

   ,GET_JSON_OBJECT(data_tmp,'$.book') AS book

   ,GET_JSON_OBJECT(data_tmp,'$.mode') AS mode

   ,GET_JSON_OBJECT(data_tmp,'$.chapter') AS chapter

   ,GET_JSON_OBJECT(data_tmp,'$.result') AS result

   ,GET_JSON_OBJECT(data_tmp,'$.knowledgeCount') AS knowledgeCount

   ,GET_JSON_OBJECT(data_tmp,'$.questCount') AS questCount

   ,GET_JSON_OBJECT(data_tmp,'$.scoreType') AS scoreType

   ,GET_JSON_OBJECT(data_tmp,'$.scoreModule') AS scoreModule

   ,GET_JSON_OBJECT(data_tmp,'$.appName') AS appName

   ,GET_JSON_OBJECT(data_tmp,'$.voteNumber') AS voteNumber

   ,GET_JSON_OBJECT(data_tmp,'$.perVoteNubmer') AS perVoteNubmer

   ,GET_JSON_OBJECT(data_tmp,'$.type') AS attributeType

--- added 2022-12-09 ----

   ,GET_JSON_OBJECT(data_tmp,'$.loginTypeName') AS loginTypeName

   ,GET_JSON_OBJECT(data_tmp,'$.name') AS noteName

   ,GET_JSON_OBJECT(data_tmp,'$.notes') AS notes

   ,GET_JSON_OBJECT(data_tmp,'$.pageNum') AS pageNum

   ,GET_JSON_OBJECT(data_tmp,'$.color') AS color  

   ,GET_JSON_OBJECT(data_tmp,'$.event') AS event

   ,GET_JSON_OBJECT(data_tmp,'$.date') AS  switchDate

   ,GET_JSON_OBJECT(data_tmp,'$.input') AS  inputValue  

   ,GET_JSON_OBJECT(data_tmp,'$.title') AS  title

   ,GET_JSON_OBJECT(data_tmp,'$.fileName') AS  fileName   -- file name

-- 1. Documents: doc, docx, PDF

-- 2. Audio: WAV, ape, AIFF, CD, AU, MP3, WMA, VQF, FLAC, MIDI, Ogg, U-Law, VOC, aac, RA/.RM/.RAM

-- 3. Video: avi, MOV/.QT, MKV, MP4, WMV, MPEG, BD, HDVD, RMVB, PROPER, R5, Watermarks, TS, DAT, SWF, ASF, 3GP, FLV, HDRIP, IMAX

-- 4. Courseware: ppt, pptx, pps, ppsx, ppa, ppam, pot, potx, thmx

-- 5. Images: Webp, BMP, PCX, TIF, GIF, JPEG, TGA, EXIF, FPX, SVG, PSD, CDR, PCD, DXF, UFO, EPS, AI, PNG, HDRI, RAW, WMF, FLIC, EMF, ICO

-- 6. Spreadsheets: xls, csv, CSS, XPS, xlsm, et

-- 7. Archives: RAR, ZIP, ARJ, Z, LZH, JAR

-- 8. Other

   ,GET_JSON_OBJECT(data_tmp,'$.fileId') AS  fileId      

   ,GET_JSON_OBJECT(data_tmp,'$.fileNames') AS  fileNames  

   ,GET_JSON_OBJECT(data_tmp,'$.beginDate') AS  beginDate    

   ,GET_JSON_OBJECT(data_tmp,'$.endDate') AS  endDate  

   ,GET_JSON_OBJECT(data_tmp,'$.questionId') AS  questionId   -- question number

   ,GET_JSON_OBJECT(data_tmp,'$.packageName') AS  packageName  

   ,GET_JSON_OBJECT(data_tmp,'$.versionName') AS  versionName    

   ,GET_JSON_OBJECT(data_tmp,'$.versionCode') AS  versionCode

   ,GET_JSON_OBJECT(data_tmp,'$.jobId') AS  jobId

   ,GET_JSON_OBJECT(data_tmp,'$.answer') AS  answer

   ,GET_JSON_OBJECT(data_tmp,'$.wrong') AS  wrong    

   ,GET_JSON_OBJECT(data_tmp,'$.correct') AS  correct

   ,GET_JSON_OBJECT(data_tmp,'$.unanswered') AS  unanswered  

   ,GET_JSON_OBJECT(data_tmp,'$.finishNumber') AS  finishNumber  

   ,GET_JSON_OBJECT(data_tmp,'$.totalNumber') AS  totalNumber  

   ,GET_JSON_OBJECT(data_tmp,'$.word') AS  word  

   ,GET_JSON_OBJECT(data_tmp,'$.msg') AS  msg  

   ,GET_JSON_OBJECT(data_tmp,'$.count') AS  impCount  

   ,GET_JSON_OBJECT(json_build,'$.pageHeight') AS pageHeight   -- page height

   ,GET_JSON_OBJECT(data_tmp,'$.answers') AS  answers -- answer status

   -- newly added --

   ,GET_JSON_OBJECT(data_tmp,'$.num') AS  num -- number of questions

   ,GET_JSON_OBJECT(data_tmp,'$.op') AS  op -- random student pick / options: op (clear (NULL), A, B, C); op (not random (0), 1, 2, 3)

   ,GET_JSON_OBJECT(data_tmp,'$.leaveTime') AS  leaveTime -- countdown to paper collection

   ,GET_JSON_OBJECT(data_tmp,'$.examId') AS  examId -- exam id

   ,GET_JSON_OBJECT(data_tmp,'$.id') AS  idCode -- ids, comma-separated; group code

   ,GET_JSON_OBJECT(data_tmp,'$.code') AS  code -- swapped students

   ,GET_JSON_OBJECT(data_tmp,'$.rol') AS  rol -- swapped students, position rol

   ,GET_JSON_OBJECT(data_tmp,'$.col') AS  col -- swapped students, position col

   ,GET_JSON_OBJECT(data_tmp,'$.stage') AS  stage -- school stage

   ,GET_JSON_OBJECT(data_tmp,'$.version') AS  versions -- version

--    ,GET_JSON_OBJECT(data_tmp,'$.type') AS  (see attributeType) -- type: 1 = annotation answer, 2 = drawing-board answer

  -- ,GET_JSON_OBJECT(data_tmp,'$.actionName') AS actionName      -- activity name

  -- ,GET_JSON_OBJECT(data_tmp,'$.answer') AS  answer    -- answer

   --,GET_JSON_OBJECT(data_tmp,'$.color') AS  color -- color

  -- ,GET_JSON_OBJECT(data_tmp,'$.finishNumber') AS  finishNumber -- number found

   --,GET_JSON_OBJECT(data_tmp,'$.totalNumber') AS  totalNumber -- total word count

  -- ,GET_JSON_OBJECT(data_tmp,'$.word') AS  word -- unanswered count

  -- `completionStatus` varchar(100) DEFAULT NULL COMMENT 'completion status',

 -- ,GET_JSON_OBJECT(data_tmp,'$.event') AS  event -- collapse/expand event

  --,GET_JSON_OBJECT(data_tmp,'$.fileId') AS  fileId -- cloud file ID

  --,GET_JSON_OBJECT(data_tmp,'$.fileNames') AS  fileNames -- list of file names

  --,GET_JSON_OBJECT(data_tmp,'$.fileName') AS  fileName -- file name

  --,GET_JSON_OBJECT(data_tmp,'$.date') AS    dates -- date filter type

  --,GET_JSON_OBJECT(data_tmp,'$.result') AS  isfinish -- completion status, result: true/false

 -- ,GET_JSON_OBJECT(data_tmp,'$.input') AS  inputValue -- input value

  --,GET_JSON_OBJECT(data_tmp,'$.jobId') AS  jobId -- assignment ID

  --,GET_JSON_OBJECT(data_tmp,'$.name') AS  name -- name

  --,GET_JSON_OBJECT(data_tmp,'$.questionId') AS  questionId -- question ID

  --`jobType` varchar(50) DEFAULT NULL COMMENT 'assignment type',

  --`noteName` varchar(50) DEFAULT NULL COMMENT 'notebook name',

  --`signName` varchar(100) DEFAULT NULL COMMENT 'tag name',

  --`switchDate` datetime DEFAULT NULL COMMENT 'date value chosen when switching dates',

  --`thickNess` varchar(10) DEFAULT NULL COMMENT 'thickness value',

  --`timeSlot` varchar(10) DEFAULT NULL COMMENT 'time slot',

  --`toolName` varchar(50) DEFAULT NULL COMMENT 'tool name',

  --`wrongBookName` varchar(50) DEFAULT NULL COMMENT 'wrong-answer notebook name',

--  ,GET_JSON_OBJECT(json_build,'$.loginTypeName') AS appcode

--    ,GET_JSON_OBJECT(json_build,'$.appVersion') AS appVersion

,createtime  -- creation time

from (

select  theid,

                    id,

                    name,

                    eventtime,

                    apptype,

                     regexp_replace(regexp_replace(regexp_replace(build,'^\\[',''),'\\]$',''),'},\\{','}|{') AS json_build,

                    spanid,

                    type,

                    appcode,

                    regexp_replace(regexp_replace(regexp_replace(data,'^\\[',''),'\\]$',''),'},\\{','}|{') AS json_data,

                    appversion,

                    appnum,

                    preid,

                    traceid,

                    createtime

                     from  dw_es_action_log_inc_new

                    WHERE

                   --  DATETRUNC(eventtime,'hh') >=  DATETRUNC(dateadd(TO_DATE('${cyctime}','yyyymmddhhmiss'), -1, 'hh'),'hh')     or    

                   

                     DATETRUNC(createtime,'DD') >=  DATETRUNC(TO_DATE('${bizdate}','yyyymmdd'),'DD')  -- for testing

) a0

lateral view explode(split(json_data,'\\|')) b AS data_tmp;

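As this shows, the same decomposition approach works for data in many different formats: strip the outer array brackets, turn the boundaries between objects into a delimiter, split on that delimiter, and extract each field from the resulting JSON object.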
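The decomposition can be mirrored outside SQL to check the logic on a small input. The sketch below is a Python stand-in, not the original job: `split_json_array` and the sample string are hypothetical, and the `userId` fallback copies the `userId` / `teacherCode` rule from the query.

```python
import json
import re


def split_json_array(raw):
    """Mimic the SQL: strip the outer [ ], turn '},{' into '}|{', split on '|'."""
    stripped = re.sub(r"\]$", "", re.sub(r"^\[", "", raw))
    delimited = re.sub(r"\},\{", "}|{", stripped)
    # Each piece plays the role of one data_tmp row produced by LATERAL VIEW explode.
    return [json.loads(piece) for piece in delimited.split("|")]


# Hypothetical sample value of the `data` column (not taken from the real table).
sample = '[{"userId":"u1","pageName":"home"},{"teacherCode":"t9","eventDesc":"click"}]'
rows = split_json_array(sample)

for row in rows:
    # dict.get returns None for a missing key, similar to GET_JSON_OBJECT returning NULL.
    user_id = row.get("userId") if row.get("userId") is not None else row.get("teacherCode")
    print(user_id, row.get("pageName"))
```

One caveat of this delimiter trick, in SQL as much as here: it breaks if a JSON value itself contains `|`, or if nested objects produce a `},{` sequence inside a single record. In Python, calling `json.loads` on the whole array would avoid that; the sketch keeps the regex route only to stay parallel to the query.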

select bh,bjmc,nj,xxbm,xxmc,xnid,xn,xnmc,xd,rnk
from (
    SELECT bh,bjmc,nj,xxbm,xxmc,xnid,xn,xnmc,xd,
           Row_Number() OVER (partition by bh,xxbm ORDER BY nj desc) rnk
    FROM dw_class
    where zt='1' and bjlxm = '1' and xnid <> ''
) aa
where rnk = 1;

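Row_Number can likewise be used to process the data: here it keeps, for each (bh, xxbm) group, the record with the highest grade (nj).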
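To make the "rank 1 per partition" idea concrete, here is a minimal Python sketch of the same selection. The helper `top_row_per_group` and the sample rows are hypothetical; only the column names `bh`, `xxbm`, and `nj` come from the query, and `nj` is assumed to be numeric here.

```python
from itertools import groupby
from operator import itemgetter


def top_row_per_group(rows, partition_keys, order_key):
    """Keep the rank-1 row of each partition, ordering by order_key descending."""
    key_fn = itemgetter(*partition_keys)
    result = []
    # groupby only merges adjacent items, so sort by the partition key first.
    for _, group in groupby(sorted(rows, key=key_fn), key=key_fn):
        ranked = sorted(group, key=itemgetter(order_key), reverse=True)
        result.append(ranked[0])  # plays the role of WHERE rnk = 1
    return result


# Hypothetical rows shaped like dw_class (only a few columns shown).
classes = [
    {"bh": "c01", "xxbm": "s1", "nj": 7},
    {"bh": "c01", "xxbm": "s1", "nj": 8},
    {"bh": "c02", "xxbm": "s1", "nj": 7},
]
latest = top_row_per_group(classes, ["bh", "xxbm"], "nj")
print(latest)
```

When two rows in a partition share the same `nj`, Row_Number gives no guarantee about which one gets rank 1, whereas this sketch keeps whichever came first in the input. If that matters, adding a tie-breaking column to the ORDER BY makes the result deterministic.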


http://www.niftyadmin.cn/n/5347348.html
