我们正在开发一个 Powershell 脚本,其中包括通过 REST API 执行多台计算机的作业导入。正常的作业导入也可以完美运行,并获取一个 XML,其中包含作为参数传递的所有必要信息。
现在我们想要并行化这个作业导入,这样这些导入中的几个可以同时发生,以减少大量计算机的导入时间。
为此,我们使用一个运行空间池并将一个工作程序(其中包含作业导入的代码)以及所有必要的参数传递给相应的 Powershell 实例。不幸的是,这似乎不起作用,因为即使在测量了导入时间之后,由于作业导入的并行化,我们也看不到任何加速。测量的时间总是与我们按顺序执行作业导入的时间大致相同——即没有并行化。
这是相关的代码片段:
function changeApplicationSequenceFromComputer {
param (
[Parameter(Mandatory=$True )]
[string]$tenant = $(throw "Parameter tenant is missing"),
[Parameter(Mandatory=$True)]
[string]$newSequenceName = $(throw "Parameter newSequenceName is missing")
)
# Other things before parallelization
# Passing all local functions and imported modules in runspace pool to call it from worker
$InitialSessionState = [initialsessionstate]::CreateDefault()
Get-ChildItem function:/ | Where-Object Source -like "" | ForEach-Object {
$functionDefinition = Get-Content "Function:\$($_.Name)"
$sessionStateFunction = New-Object System.Management.Automation.Runspaces.SessionStateFunctionEntry -ArgumentList $_.Name, $functionDefinition
$InitialSessionState.Commands.Add($sessionStateFunction)
}
# Using a synchronized Hashtable to pass necessary global variables for logging purpose
$Configuration = [hashtable]::Synchronized(@{})
$Configuration.ScriptPath = $global:ScriptPath
$Configuration.LogPath = $global:LogPath
$Configuration.LogFileName = $global:LogFileName
$InitialSessionState.ImportPSModule(@("$global:ScriptPath\lib\MigrationFuncLib.psm1"))
# Worker for parallelized job-import in for-each loop below
$Worker = {
param($currentComputerObjectTenant, $currentComputerObjectDisplayName, $newSequenceName, $Credentials, $Configuration)
$global:ScriptPath = $Configuration.ScriptPath
$global:LogPath = $Configuration.LogPath
$global:LogFileName = $Configuration.LogFileName
try {
# Function handleComputerSoftwareSequencesXml creates the xml that has to be uploaded for each computer
# We already tried to create the xml outside of the worker and pass it as an argument, so that the worker just imports it. Same result.
$importXml = handleComputerSoftwareSequencesXml -tenant $currentComputerObjectTenant -computerName $currentComputerObjectDisplayName -newSequence $newSequenceName -Credentials $Credentials
$Result = job-import $importXml -Server localhost -Credentials $Credentials
# sleep 1 just for testing purpose
Log "Result from Worker: $Result"
} catch {
$Result = $_.Exception.Message
}
}
# Preparatory work for parallelization
$cred = $Credentials
$MaxRunspacesProcessors = ($env:NUMBER_OF_PROCESSORS) * $multiplier # we tried it with just the number of processors as well as with a multiplied version.
Log "Number of Processors: $MaxRunspacesProcessors"
$RunspacePool = [runspacefactory]::CreateRunspacePool(1, $MaxRunspacesProcessors, $InitialSessionState, $Host)
$RunspacePool.Open()
$Jobs = New-Object System.Collections.ArrayList
foreach ($computer in $computerWithOldApplicationSequence) {
# Different things to do before parallelization, i.e. define some variables
# Parallelized job-import
Log "Creating or reusing runspace for computer '$currentComputerObjectDisplayName'"
$PowerShell = [powershell]::Create()
$PowerShell.RunspacePool = $RunspacePool
Log "Before worker"
$PowerShell.AddScript($Worker).AddArgument($currentComputerObjectTenant).AddArgument($currentComputerObjectDisplayName).AddArgument($newSequenceName).AddArgument($cred).AddArgument($Configuration) | Out-Null
Log "After worker"
$JobObj = New-Object -TypeName PSObject -Property @{
Runspace = $PowerShell.BeginInvoke()
PowerShell = $PowerShell
}
$Jobs.Add($JobObj) | Out-Null
# For logging in Worker
$JobIndex = $Jobs.IndexOf($JobObj)
Log "$($Jobs[$JobIndex].PowerShell.EndInvoke($Jobs[$JobIndex].Runspace))"
}
<#
while ($Jobs.Runspace.IsCompleted -contains $false) {
Log "Still running..."
Start-Sleep 1
}
#>
# Closing/Disposing pool
} # End of the function
脚本的其余部分如下所示(简化):
# Parameter passed when calling the script
param (
[Parameter(Mandatory=$True)]
[string]$newSequenceName = $(throw "Parameter target is missing"),
[Parameter(Mandatory=$True)]
[float]$multiplier= $(throw "Parameter multiplier is missing")
)
# 'main' block
$timeToRun = (Measure-Command{
changeApplicationSequenceFromComputer -tenant "testTenant" -newSequenceName $newSequenceName
}).TotalSeconds
Log "Total time to run with multiplier $($multiplier) is $timeToRun"
尽管运行空间池和相应的并行化,为什么作业导入显然只是按顺序执行的任何想法?