Running YDB in standalone mode on a Mac #1

Open
zqhxuyuan opened this issue Nov 30, 2015 · 0 comments

YDB standalone mode

Start ZooKeeper and HDFS

zkServer.sh start
start-dfs.sh

Edit conf/storm.yaml

 storm.zookeeper.servers:
     - "localhost"
 storm.zookeeper.port: 2181
 storm.zookeeper.root: "/ycloud/ydb_zkroot"


 #### Local working directory ####
 storm.local.dir: "/Users/zhengqh/data/ydb"

 #### Nimbus master configuration ####
 nimbus.host: "127.0.0.1"

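storm.local.dir: change this to a directory on your own local disk.
storm.zookeeper.root does not need to be changed.
nimbus.host must not be set to localhost; otherwise nimbus does not respond when it is started later.

Edit ydb_site.xml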

 ### HDFS path used by YDB
 ydb.hdfs.path: "/data/ycloud/ydb/ydbpath"
 ### Local Hadoop conf directory ###
 hadoop.conf.dir: "/Users/zhengqh/Soft/cdh542/hadoop-2.6.0-cdh5.4.2/etc/hadoop"

 ### HDFS directory for the raw data to import ###
 ydb.reader.rawdata.hdfs.path: "/data/ycloud/ydb/rawdata"

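hadoop.conf.dir is the path to the local Hadoop configuration directory. The other paths do not need to be changed (they are all HDFS paths).

Change the permissions of bin/ydb to 777

On my Mac, without this permission change the startup commands below fail with a permission error.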
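
The permission change can be sketched as a small shell helper. This is only a sketch: the function name `make_ydb_executable` is hypothetical, and it uses mode 755 instead of 777 on the assumption that the error comes from a missing execute bit on the `ydb` script. If that assumption does not hold on your machine, fall back to 777 as described above.

```shell
# Hypothetical helper: make the ydb launcher script executable.
# $1 is the YDB install directory (the one that contains bin/ydb).
make_ydb_executable() {
  local ydb_home="$1"
  if [ ! -f "$ydb_home/bin/ydb" ]; then
    echo "not found: $ydb_home/bin/ydb" >&2
    return 1
  fi
  # Assumption: rwxr-xr-x is enough; the original note uses 777.
  chmod 755 "$ydb_home/bin/ydb"
}
```

With the install path used in this issue, the call would be `make_ydb_executable /Users/zhengqh/Soft/ycloud/ydb`.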

Start the cluster

cd /Users/zhengqh/Soft/ycloud/ydb/bin
nohup ./ydb nimbus >nimbus.log 2>&1 &
nohup ./ydb supervisor >supervisor.log 2>&1 &
nohup ./ydb http 8080 >http.log 2>&1 &

[screenshot: ydb-1] [screenshot: ydb-5]

Start the workers

./ydb start

[screenshot: ydb-2]

Check the processes:

➜  ydb  jps -l
5163 com.alipay.bluewhale.core.work.Worker
5023 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
5165 com.alipay.bluewhale.core.work.Worker
3415 org.apache.zookeeper.server.quorum.QuorumPeerMain
5137 cn.net.ycloud.ydb.server.YdbHttpServer
4850 org.apache.hadoop.hdfs.server.namenode.NameNode
5056 com.alipay.bluewhale.core.daemon.NimbusServer
5124 com.alipay.bluewhale.core.daemon.supervisor.Supervisor
4922 org.apache.hadoop.hdfs.server.datanode.DataNode
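
A quick way to confirm that nothing is missing from this list is to compare the `jps -l` output against the class names shown above. The sketch below assumes those class names are the ones your build reports; the function name `missing_daemons` is hypothetical.

```shell
# Hypothetical helper: print the name of every expected daemon that does
# not appear in `jps -l`. Prints nothing when all of them are running.
missing_daemons() {
  local running name
  running="$(jps -l)"
  for name in QuorumPeerMain NameNode DataNode NimbusServer Supervisor YdbHttpServer Worker; do
    # Match ".<name>" at end of line so NameNode does not match SecondaryNameNode.
    if ! printf '%s\n' "$running" | grep -q "\.${name}\$"; then
      echo "$name"
    fi
  done
}
```

Running `missing_daemons` after `./ydb start` should print nothing if the setup matches the listing above.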

Create the table

[screenshot: ydb-3]

The table schema is stored on HDFS:

[screenshot: ydb-4]

Upload the example file to HDFS

➜  bin  hadoop fs -put ../data_example/jsondata_example.txt /data/jsondata_example.txt

➜  bin  hadoop fs -cat /data/jsondata_example.txt | head
{"tablename":"ydbexample","ydbpartion":"20151011","data":{"indexnum":0,"label":"l_0","userage":10,"clickcount":0,"paymoney":0,"price":0,"content":"0 0 0 延云 ydb 延云  测试  中文分词 中华人民共和国 沈阳延云云计算技术有限公司","contentcjk":"0 0 0 延云 ydb 延云  测试  中文分词 中华人民共和国 沈阳延云云计算技术有限公司"}}
{"tablename":"ydbexample","ydbpartion":"20151011","data":{"indexnum":1,"label":"l_1","userage":11,"clickcount":1,"paymoney":1.033,"price":1.03,"content":"1 1 1 延云 ydb 延云  测试  中文分词 中华人民共和国 沈阳延云云计算技术有限公司","contentcjk":"1 1 1 延云 ydb 延云  测试  中文分词 中华人民共和国 沈阳延云云计算技术有限公司"}}

If you follow the official documentation, the target path is /data/myntest/jsondata_example.txt, in which case you first need to create /data/myntest on HDFS.
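
Creating the parent directory and uploading can be combined in one step. The sketch below uses `hadoop fs -mkdir -p`, which creates missing parent directories; the function name `hdfs_put_with_parent` is hypothetical.

```shell
# Hypothetical helper: create the HDFS parent directory, then upload.
# $1 is the local file, $2 is the full HDFS destination path.
hdfs_put_with_parent() {
  local src="$1" dest="$2"
  hadoop fs -mkdir -p "$(dirname "$dest")" && hadoop fs -put "$src" "$dest"
}
```

For the path from the official documentation this would be `hdfs_put_with_parent ../data_example/jsondata_example.txt /data/myntest/jsondata_example.txt`.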

Insert and query data

➜  bin  curl "http://localhost:8080/insert?taskid=0&hdfsfile=/data/jsondata_example.txt"
{"code":0,"list":["insert:/data/jsondata_example.txt,/data/ycloud/ydb/rawdata/0/jsondata_example.txt_20151130204532_1457527103"],"skip":[],"error":[]}
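
In a script, the insert response can be checked before moving on to queries. The sketch below assumes that `"code":0` means success, which is inferred only from the sample responses in this issue (the failed query further down returns code 3). The function name `insert_succeeded` is hypothetical.

```shell
# Hypothetical helper: read a YDB JSON response on stdin and succeed
# only if its "code" field is 0 (assumed to mean success).
insert_succeeded() {
  grep -Eq '"code"[[:space:]]*:[[:space:]]*0[^0-9]'
}
```

Usage: `curl -s "http://localhost:8080/insert?taskid=0&hdfsfile=/data/jsondata_example.txt" | insert_succeeded && echo "insert accepted"`.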

Wait a little while, then open the link below in a browser (note: curl seemed to give no response):
http://localhost:8080/sql?sql=select indexnum,label from ydbexample where ydbpartion='20151222' limit 0,100

{
ydbau: 1,
code: 3,
timetaken: 17,
msg: "can not found ydppartion 20151222 in path /data/ycloud/ydb/ydbpath/index/ydbexample/index/20151222"
}

That partition does not exist. Change the partition to 20151011:
http://localhost:8080/sql?sql=select indexnum,label from ydbexample where ydbpartion='20151011' limit 0,100

[screenshot: ydb-6]
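
One possible reason curl appeared not to respond is that the SQL in the URL contains unencoded spaces and quotes, which a browser encodes automatically. This is an assumption and was not verified against YDB. A minimal sketch that lets curl do the encoding with `-G` and `--data-urlencode` follows; the function name `ydb_sql` and the `YDB_HTTP` variable are hypothetical.

```shell
# Hypothetical helper: send a SQL statement to the YDB HTTP server,
# letting curl URL-encode it into the query string.
# YDB_HTTP overrides the base URL; it defaults to the port used above.
ydb_sql() {
  local sql="$1"
  local base="${YDB_HTTP:-http://localhost:8080}"
  curl -sG "$base/sql" --data-urlencode "sql=$sql"
}
```

Usage: `ydb_sql "select indexnum,label from ydbexample where ydbpartion='20151011' limit 0,100"`.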

Other queries

Query:
http://localhost:8080/sql?sql=select indexnum,label from ydbexample where ydbpartion='20151011' limit 0,100

Sorting:
http://localhost:8080/sql?sql=select indexnum,label from ydbexample where ydbpartion='20151011' order by indexnum desc limit 0,100

COUNT:
http://localhost:8080/sql?sql=select count(*),count(indexnum) from ydbexample where ydbpartion='20151011' limit 0,100

{
  ydbau: 1,
  code: 0,
  count: 1,
  doccount: 2500,
  timetaken: 116,
  list: [
    {
      stat: {
        count(*): "2500",
        count(indexnum): "2500"
      }
    }
  ]
}

Aggregates:
http://localhost:8080/sql?sql=select sum(clickcount),avg(clickcount),max(clickcount),min(clickcount) from ydbexample where ydbpartion='20151011' limit 0,100

{
  ydbau: 1,
  code: 0,
  count: 1,
  doccount: 2500,
  timetaken: 54,
  list: [
    {
      stat: {
        sum(clickcount): "73350",
        avg(clickcount): "29.34",
        max(clickcount): "59",
        min(clickcount): "0"
      }
    }
  ]
}

Shutting down

stop-dfs.sh
zkServer.sh stop
jps |grep YdbHttpServer|awk '{printf("%s\n",$1)}'|xargs kill
jps |grep Supervisor|awk '{printf("%s\n",$1)}'|xargs kill
jps |grep NimbusServer|awk '{printf("%s\n",$1)}'|xargs kill
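
The three kill pipelines above can be folded into one helper that also skips `kill` when nothing matches. This is a sketch, not part of YDB; the function name `stop_by_class` is hypothetical, and it matches on the class names printed by `jps -l`.

```shell
# Hypothetical helper: kill every JVM whose main class (as printed by
# `jps -l`) contains one of the given names.
stop_by_class() {
  local name pids
  for name in "$@"; do
    pids="$(jps -l | awk -v n="$name" 'index($2, n) > 0 { print $1 }')"
    if [ -n "$pids" ]; then
      # Intentionally unquoted: one PID per word.
      kill $pids
    else
      echo "no process matching $name" >&2
    fi
  done
}
```

Usage, equivalent to the three lines above: `stop_by_class YdbHttpServer Supervisor NimbusServer`.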