
I am copying a csv file into Cassandra. I have the csv file shown below, and the table is created as follows.

CREATE TABLE UCBAdmissions(
id int PRIMARY KEY,
admit text,
dept text,
freq int,
gender text
);

  1. When I use

    COPY UCBAdmissions FROM 'UCBAdmissions.csv' WITH DELIMITER = ',' AND HEADER = TRUE;

the output is
24 rows imported in 0.318 seconds.
cqlsh> select * from UCBAdmissions;

 id | admit | dept | freq | gender
----+-------+------+------+--------

(0 rows)

  2. When I use

    COPY UCBAdmissions(id, admit, gender, dept, freq) FROM 'UCBAdmissions.csv' WITH DELIMITER = ',' AND HEADER = TRUE;

the output is
24 rows imported in 0.364 seconds.
cqlsh> select * from UCBAdmissions;

 id | admit    | dept | freq | gender
----+----------+------+------+--------
 23 | Admitted |    F |   24 | Female
  5 | Admitted |    B |  353 |   Male
 10 | Rejected |    C |  205 |   Male
 16 | Rejected |    D |  244 | Female
 13 | Admitted |    D |  138 |   Male
 11 | Admitted |    C |  202 | Female
  1 | Admitted |    A |  512 |   Male
 19 | Admitted |    E |   94 | Female
  8 | Rejected |    B |    8 | Female
  2 | Rejected |    A |  313 |   Male
  4 | Rejected |    A |   19 | Female
 18 | Rejected |    E |  138 |   Male
 15 | Admitted |    D |  131 | Female
 22 | Rejected |    F |  351 |   Male
 20 | Rejected |    E |  299 | Female
  7 | Admitted |    B |   17 | Female
  6 | Rejected |    B |  207 |   Male
  9 | Admitted |    C |  120 |   Male
 14 | Rejected |    D |  279 |   Male
 21 | Admitted |    F |   22 |   Male
 17 | Admitted |    E |   53 |   Male
 24 | Rejected |    F |  317 | Female
 12 | Rejected |    C |  391 | Female
  3 | Admitted |    A |   89 | Female

UCBAdmissions.csv

"","Admit","Gender","Dept","Freq"
"1","Admitted","Male","A",512
"2","Rejected","Male","A",313
"3","Admitted","Female","A",89
"4","Rejected","Female","A",19
"5","Admitted","Male","B",353
"6","Rejected","Male","B",207
"7","Admitted","Female","B",17
"8","Rejected","Female","B",8
"9","Admitted","Male","C",120
"10","Rejected","Male","C",205
"11","Admitted","Female","C",202
"12","Rejected","Female","C",391
"13","Admitted","Male","D",138
"14","Rejected","Male","D",279
"15","Admitted","Female","D",131
"16","Rejected","Female","D",244
"17","Admitted","Male","E",53
"18","Rejected","Male","E",138
"19","Admitted","Female","E",94
"20","Rejected","Female","E",299
"21","Admitted","Male","F",22
"22","Rejected","Male","F",351
"23","Admitted","Female","F",24
"24","Rejected","Female","F",317

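As shown above, the rows come back in a different order from the csv file.
Question: what is the difference between 1 and 2? Should we create the table in Cassandra with its columns in the same order as the csv file?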

1 Answer

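Cassandra is designed to be distributed. To achieve this, it takes the partition key of your table (id), hashes it with the cluster's partitioner (probably Murmur3Partitioner) to produce a token, and uses that token to assign the row to a node in the ring.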
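
To see which token each row hashes to, the built-in CQL function token() can be applied to the partition key. This is a minimal sketch against the table from the question; the actual token values depend on the partitioner in use, so none are shown here.

```cql
-- Show each row's partition key next to the token computed from it.
-- Rows are returned in ascending token order, not ascending id order.
SELECT id, token(id) FROM UCBAdmissions;
```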

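What you are seeing is the result set ordered by the resulting token, which is unintuitive but not necessarily wrong. There is no straightforward way to do a SELECT * FROM table ORDER BY primaryKey ASC in Cassandra; its distributed nature makes that hard to do efficiently.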
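
If ordered reads matter, one common approach is to model the table so that the column to sort on is a clustering column, since Cassandra can return rows in clustering order within a single partition. The sketch below assumes a hypothetical table named UCBAdmissions_by_dept, which is not part of the question, and is only one possible layout.

```cql
-- Hypothetical table (an assumption for illustration):
-- dept is the partition key, id is a clustering column.
CREATE TABLE UCBAdmissions_by_dept (
    dept text,
    id int,
    admit text,
    freq int,
    gender text,
    PRIMARY KEY (dept, id)
) WITH CLUSTERING ORDER BY (id ASC);

-- ORDER BY is accepted here because the partition key is restricted
-- and id is a clustering column.
SELECT * FROM UCBAdmissions_by_dept WHERE dept = 'A' ORDER BY id ASC;
```

This orders rows only within one dept partition; ordering across partitions would still have to be done on the client side.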

answered 2015-12-16T06:03:44.460