I think you need an INNER JOIN with a DISTINCT here:
SELECT distinct uns.*
FROM uniquestructures as uns
INNER JOIN uniqueproteins as unp on uns.ProteinID = unp.ProteinId
where LENGTH(unp.PDBASequence) < 20;
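In addition, you would probably be better off creating a separate column on the uniqueproteins table to hold the length of the PDBASequence column (for example, PDBASequenceLength). You can then put an index on PDBASequenceLength and filter on it directly instead of calling LENGTH(PDBASequence) in the query, since a function call in the WHERE clause cannot use an ordinary index on the column. If the data is not static, create triggers to populate PDBASequenceLength each time a row is inserted into or updated in the uniqueproteins table. For example: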
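-- Add the column that stores the sequence length.
ALTER TABLE uniqueproteins ADD COLUMN PDBASequenceLength INT;
-- The triggers must be BEFORE triggers: MySQL does not allow changing NEW in an AFTER trigger.
CREATE TRIGGER uniqueproteins_length_insert_trg
BEFORE INSERT ON uniqueproteins FOR EACH ROW SET NEW.PDBASequenceLength = LENGTH(NEW.PDBASequence);
CREATE TRIGGER uniqueproteins_length_update_trg
BEFORE UPDATE ON uniqueproteins FOR EACH ROW SET NEW.PDBASequenceLength = LENGTH(NEW.PDBASequence);
-- Index the new column so the length filter can use it.
ALTER TABLE uniqueproteins ADD KEY `uniqueproteinsIdx2` (PDBASequenceLength);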
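Note that the triggers only fill in the length for rows inserted or updated after they are created. Rows that already exist would need a one-time backfill. A minimal sketch, assuming the PDBASequenceLength column has already been added and is still NULL for the existing rows:

```sql
-- Assumes PDBASequenceLength already exists on uniqueproteins
-- and has not yet been populated for the existing rows.
UPDATE uniqueproteins
SET PDBASequenceLength = LENGTH(PDBASequence)
WHERE PDBASequenceLength IS NULL;
```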
Your query could then be:
SELECT uns.*
FROM uniquestructures as uns
INNER JOIN uniqueproteins as unp on uns.ProteinID = unp.ProteinId
where unp.PDBASequenceLength < 20;
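To check whether the new index is actually picked up, you could prefix the query with EXPLAIN. This is only a sketch: whether the optimizer chooses uniqueproteinsIdx2 depends on your data and on the other indexes present, so treat the plan as something to inspect rather than a guaranteed result.

```sql
-- Shows the execution plan instead of running the query;
-- look at the "key" column for the uniqueproteins row.
EXPLAIN
SELECT uns.*
FROM uniquestructures AS uns
INNER JOIN uniqueproteins AS unp ON uns.ProteinID = unp.ProteinId
WHERE unp.PDBASequenceLength < 20;
```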
Good luck!