Arrays of primitive types can be featurized.
If your class looks like this:
class Policy
{
    public string Name { get; set; }
    public DateTime InceptionDate { get; set; }
    public DateTime ExpirationDate { get; set; }
    public float[] Locations { get; set; }
}
Then Locations will be converted to a Vector of type R4 (which maps to float).
Then you create a SchemaDefinition:
var env = new LocalEnvironment();
var schemaDef = SchemaDefinition.Create(typeof(Policy));
If the size of the vector is not known at compile time, you also need:
int vectorSize = 4;
schemaDef["Locations"].ColumnType = new VectorType(NumberType.R4, vectorSize);
If the size of the vector is fixed, you can add the VectorType attribute to the property:
class Policy
{
    public string Name { get; set; }
    public DateTime InceptionDate { get; set; }
    public DateTime ExpirationDate { get; set; }
    [VectorType(4)]
    public float[] Locations { get; set; }
}
Then create the DataView:
var data = new List<Policy>();
var dataView = env.CreateStreamingDataView(data, schemaDef);
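In your case, Locations is a class, so I believe you first need to convert it to an array of primitives by concatenating the values, as in this example: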
public class IrisData
{
    public float Label;
    public float SepalLength;
    public float SepalWidth;
    public float PetalLength;
    public float PetalWidth;
}
public class IrisVectorData
{
    public float Label;
    public float[] Features;
}
static void Main(string[] args)
{
    // Here's a data array that we want to work on.
    var dataArray = new[] {
        new IrisData{Label=1, PetalLength=1, SepalLength=1, PetalWidth=1, SepalWidth=1},
        new IrisData{Label=0, PetalLength=2, SepalLength=2, PetalWidth=2, SepalWidth=2}
    };
    // Create the ML.NET environment.
    var env = new Microsoft.ML.Runtime.Data.TlcEnvironment();
    // Create the data view.
    // This method will use the definition of IrisData to understand what columns there are in the
    // data view.
    var dv = env.CreateDataView<IrisData>(dataArray);
    // Now let's do something to the data view. For example, concatenate all four non-label columns
    // into 'Features' column.
    dv = new Microsoft.ML.Runtime.Data.ConcatTransform(env, dv, "Features",
        "SepalLength", "SepalWidth", "PetalLength", "PetalWidth");
    // Read the data into another array, this time we read the 'Features' and 'Label' columns
    // of the data, and ignore the rest.
    // This method will use the definition of IrisVectorData to understand which columns and of which types
    // are expected to be present in the input data.
    var arr = dv.AsEnumerable<IrisVectorData>(env, reuseRowObject: false)
        .ToArray();
}
But I haven't actually tried this case, so I can't be of more help here.
Also check out the schema comprehension doc here
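If the concatenation transform turns out to be awkward for a nested class, another option might be to flatten the objects yourself in plain C# before handing the data to ML.NET. The sketch below uses no ML.NET calls at all. The Location class with Latitude and Longitude, and the LocationFlattener helper, are hypothetical names I made up for illustration, since the real shape of your class isn't shown. It pads with zeros or truncates so every row ends up with the same fixed length, which is what a fixed-size vector column would need.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of the Location class; the real fields are an assumption.
public class Location
{
    public float Latitude { get; set; }
    public float Longitude { get; set; }
}

// Hypothetical helper, not part of ML.NET.
public static class LocationFlattener
{
    // Concatenates the numeric fields of each location into one flat float[].
    // The result is zero-padded or truncated to exactly vectorSize entries.
    public static float[] Flatten(IEnumerable<Location> locations, int vectorSize)
    {
        var values = locations
            .SelectMany(l => new[] { l.Latitude, l.Longitude })
            .Take(vectorSize)
            .ToArray();

        var result = new float[vectorSize]; // initialized to zeros
        Array.Copy(values, result, values.Length);
        return result;
    }
}
```

The resulting float[] could then be assigned to a property like Locations in the Policy class above, with vectorSize matching the size declared for the vector column.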