6.1 定位运算
6.1.1 成员定位
1. 获取 5 在序列中的位置
2. 判断 5 是否在序列中
SPL
A | B | |
---|---|---|
1 | [2,9,7,7,2,2,9,3,3,5] | |
2 | =A1.pos(5) | /10 |
3 | =A1.pos(5)>0 | /true |
SQL
1. 获取 5 在序列中的位置
SELECT position FROM (
SELECT COLUMN_VALUE AS num, ROWNUM position
FROM TABLE(SYS.ODCINUMBERLIST(2,9,7,7,2,2,9,3,3,5)))
WHERE num = 5;
2. 判断 5 是否在序列中
SELECT SIGN(COUNT(*)) AS result
FROM TABLE(SYS.ODCINUMBERLIST(2,9,7,7,2,2,9,3,3,5))
WHERE COLUMN_VALUE = 5;
Python
sequence = np.array([2, 9, 7, 7, 2, 2, 9, 3, 3, 5])
indices = np.where(sequence == 5)[0] #9
is_present = len(np.where(sequence == 5)[0])!=0 #True
6.1.2 分段定位
定位指定日期属于第几段
SPL
A | B | |
---|---|---|
1 | [1970-01-01,1980-01-01,1990-01-01,2000-01-01,2010-01-01,2020-01-01] | |
2 | 1965-02-03 | |
3 | 1979-12-31 | |
4 | 1980-01-01 | |
5 | 2005-08-09 | |
6 | 2023-03-23 | |
7 | =A1.pseg(A2) | /0 |
8 | =A1.pseg(A3) | /1 |
9 | =A1.pseg(A4) | /2 |
10 | =A1.pseg(A5) | /4 |
11 | =A1.pseg(A6) | /6 |
12 | =[A2:A6].(A1.pseg(~)) | /[0,1,2,4,6] |
pos()和 pseg()都不是循环函数,所以本例中的 A12 中的 ~ 是 [A2:A6] 中的成员。
SQL
SQL 没有提供特别便捷的方法。
Python
from datetime import datetime
import bisect
date_list = ["1970-01-01","1980-01-01","1990-01-01","2000-01-01","2010-01-01","2020-01-01"]
date_list = [datetime.strptime(date, "%Y-%m-%d") for date in date_list]
date1 = datetime.strptime("1965-02-03", "%Y-%m-%d")
index1 = bisect.bisect_right(date_list, date1) #0
date2 = datetime.strptime("1979-12-31", "%Y-%m-%d")
index2 = bisect.bisect_right(date_list, date2) #1
date3 = datetime.strptime("1980-01-01", "%Y-%m-%d")
index3 = bisect.bisect_right(date_list, date3) #2
date4 = datetime.strptime("2005-08-09", "%Y-%m-%d")
index4 = bisect.bisect_right(date_list, date4) #4
date5 = datetime.strptime("2023-03-23", "%Y-%m-%d")
index5 = bisect.bisect_right(date_list, date5) #6
6.1.3 条件定位
查看销售部门员工在员工表中的位置
SPL
A | |
---|---|
1 | =file(“EMPLOYEE.csv”).import@tc() |
2 | =A1.pselect@a(DEPT==“Sales”) |
pselect() 只选出第一个满足条件的成员位置,@a 选项会返回所有满足条件的成员位置。
SQL
SELECT rn FROM (
SELECT EMPLOYEE.*, ROWNUM AS rn FROM EMPLOYEE)
WHERE DEPT = 'Sales';
Python
df = pd.read_csv('../EMPLOYEE.csv')
sales_employees_index = np.where(df['DEPT'] == 'Sales')[0]
6.1.4 普通聚合定位
查看工资最高员工在员工表中的位置
SPL
A | B | |
---|---|---|
1 | =file(“EMPLOYEE.csv”).import@tc() | |
2 | =A1.pmax(SALARY) | /1 个成员的位置 |
3 | =A1.pmax@a(SALARY) | / 全部成员的位置 |
SQL
1. 1 个成员的位置
SELECT rn FROM (
SELECT EMPLOYEE.*, ROWNUM AS rn
FROM EMPLOYEE
ORDER BY SALARY DESC)
FETCH FIRST 1 ROWS ONLY;
2. 全部成员的位置
SELECT rn FROM (
SELECT EMPLOYEE.*, ROWNUM AS rn
FROM EMPLOYEE)
WHERE SALARY=(SELECT MAX(SALARY) FROM EMPLOYEE);
Python
df = pd.read_csv('../EMPLOYEE.csv')
#1个成员的位置
highest_salary_employee_index_first = np.argmax(df['SALARY'])
#全部成员的位置
highest_salary_employee_index_all = np.where(df['SALARY'] == df['SALARY'].max())[0]
Python 选出 1 个成员位置和多个成员位置是完全不同的方法,选出多个位置时需要遍历两遍,一遍是计算最大值,另一遍是找到索引。
6.1.5 集合聚合定位
查看工资最高的 10 个员工在员工表中的位置
SPL
A | |
---|---|
1 | =file(“EMPLOYEE.csv”).import@tc() |
2 | =A1.ptop(-10,SALARY) |
SQL
SELECT rn FROM (
SELECT EMPLOYEE.*, ROWNUM AS rn
FROM EMPLOYEE
ORDER BY SALARY DESC)
FETCH FIRST 10 ROWS ONLY;
Python
df = pd.read_csv('../EMPLOYEE.csv')
top_10_positions = np.argsort(-df['SALARY'])[:10]
Pandas 是排序后取前 10 的方法。