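We are developing a PowerShell script that, among other things, performs a job import for multiple computers via a REST API. The normal job import works perfectly: it receives an XML, passed as a parameter, that contains all the necessary information.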

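Now we want to parallelize this job import so that several imports can run at the same time, which should reduce the import time for a large number of computers.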

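For this purpose, we use a runspace pool and pass a worker (which contains the code for the job import) together with all necessary parameters to the respective PowerShell instance. Unfortunately, this does not seem to work: when we measure the import time, we see no speedup from the parallelization. The measured time is always about the same as when we run the job imports sequentially, i.e. without parallelization.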

Here is the relevant code snippet:

function changeApplicationSequenceFromComputer {
param (
    [Parameter(Mandatory=$True )]
    [string]$tenant = $(throw "Parameter tenant is missing"),
    [Parameter(Mandatory=$True)]
    [string]$newSequenceName = $(throw "Parameter newSequenceName is missing")

)  

    # Other things before parallelization


    # Passing all local functions and imported modules in runspace pool to call it from worker
    $InitialSessionState = [initialsessionstate]::CreateDefault()
    Get-ChildItem function:/ | Where-Object Source -like "" | ForEach-Object {
        $functionDefinition = Get-Content "Function:\$($_.Name)"
        $sessionStateFunction = New-Object System.Management.Automation.Runspaces.SessionStateFunctionEntry -ArgumentList $_.Name, $functionDefinition
        $InitialSessionState.Commands.Add($sessionStateFunction)
    }
    # Using a synchronized Hashtable to pass necessary global variables for logging purpose
    $Configuration = [hashtable]::Synchronized(@{})
    $Configuration.ScriptPath = $global:ScriptPath
    $Configuration.LogPath = $global:LogPath
    $Configuration.LogFileName = $global:LogFileName
    
    $InitialSessionState.ImportPSModule(@("$global:ScriptPath\lib\MigrationFuncLib.psm1"))

    # Worker for parallelized job-import in for-each loop below
    $Worker = {
        param($currentComputerObjectTenant, $currentComputerObjectDisplayName, $newSequenceName, $Credentials, $Configuration)
        $global:ScriptPath = $Configuration.ScriptPath
        $global:LogPath = $Configuration.LogPath
        $global:LogFileName = $Configuration.LogFileName
        try { 
            # Function handleComputerSoftwareSequencesXml creates the xml that has to be uploaded for each computer
            # We already tried to create the xml outside of the worker and pass it as an argument, so that the worker just imports it. Same result.
            $importXml = handleComputerSoftwareSequencesXml -tenant $currentComputerObjectTenant -computerName $currentComputerObjectDisplayName -newSequence $newSequenceName -Credentials $Credentials
            $Result =  job-import $importXml -Server localhost -Credentials $Credentials 
            # sleep 1 just for testing purpose
            Log "Result from Worker: $Result"
        } catch {
            $Result = $_.Exception.Message
        }
    } 

    # Preparatory work for parallelization
    $cred = $Credentials
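    # $env:NUMBER_OF_PROCESSORS is a string; cast it first, otherwise '*' repeats the string instead of multiplying.
    $MaxRunspacesProcessors = [int]([int]$env:NUMBER_OF_PROCESSORS * $multiplier) # we tried it with just the number of processors as well as with a multiplied version.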
    
    Log "Number of Processors: $MaxRunspacesProcessors"

    $RunspacePool = [runspacefactory]::CreateRunspacePool(1, $MaxRunspacesProcessors, $InitialSessionState, $Host) 
    $RunspacePool.Open()
    
    $Jobs = New-Object System.Collections.ArrayList

    foreach ($computer in $computerWithOldApplicationSequence) {

        # Different things to do before parallelization, i.e. define some variables 

        # Parallelized job-import
        
        Log "Creating or reusing runspace for computer '$currentComputerObjectDisplayName'"
        $PowerShell = [powershell]::Create() 
        $PowerShell.RunspacePool = $RunspacePool
        Log "Before worker"
        $PowerShell.AddScript($Worker).AddArgument($currentComputerObjectTenant).AddArgument($currentComputerObjectDisplayName).AddArgument($newSequenceName).AddArgument($cred).AddArgument($Configuration) | Out-Null
        Log "After worker"

        $JobObj = New-Object -TypeName PSObject -Property @{
            Runspace = $PowerShell.BeginInvoke()
            PowerShell = $PowerShell  
        }

        $Jobs.Add($JobObj) | Out-Null

        # For logging in Worker 
        $JobIndex = $Jobs.IndexOf($JobObj)
        Log "$($Jobs[$JobIndex].PowerShell.EndInvoke($Jobs[$JobIndex].Runspace))"

}
        <#
        while ($Jobs.Runspace.IsCompleted -contains $false) {
        Log "Still running..."
        Start-Sleep 1
        }
        #>
        # Closing/Disposing pool

} # End of the function

The rest of the script looks like this (simplified):

# Parameter passed when calling the script
param (
    [Parameter(Mandatory=$True)]
    [string]$newSequenceName = $(throw "Parameter target is missing"),
    [Parameter(Mandatory=$True)]
    [float]$multiplier= $(throw "Parameter multiplier is missing")
)

# 'main' block

$timeToRun = (Measure-Command{
    
    changeApplicationSequenceFromComputer -tenant "testTenant" -newSequenceName $newSequenceName    

}).TotalSeconds


Log "Total time to run with multiplier $($multiplier) is  $timeToRun"

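Any idea why the job imports are apparently still executed sequentially, despite the runspace pool and the corresponding parallelization?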

1 Answer

We found the error. The foreach contained the following code block:

        # For logging in Worker 
        $JobIndex = $Jobs.IndexOf($JobObj)
        Log "$($Jobs[$JobIndex].PowerShell.EndInvoke($Jobs[$JobIndex].Runspace))"

This has to be moved outside the foreach, so that the code looks like this:

    function changeApplicationSequenceFromComputer {
    param (
        [Parameter(Mandatory=$True )]
        [string]$tenant = $(throw "Parameter tenant is missing"),
        [Parameter(Mandatory=$True)]
        [string]$newSequenceName = $(throw "Parameter newSequenceName is missing")
    
    )  
    
    # ... Everything as before
    
    $Jobs.Add($JobObj) | Out-Null
    
    } #end of foreach

$Results = @()
foreach($Job in $Jobs ){   
    $Results += $Job.PowerShell.EndInvoke($Job.Runspace)    
}

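So EndInvoke() must be called outside the foreach. EndInvoke() blocks until the given invocation has finished, so calling it inside the loop made each iteration wait for its own job before the next job was even started.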
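To make the effect easy to reproduce without the REST API, here is a minimal, self-contained sketch of the two-phase pattern. It rests on a few assumptions that are not part of the original script: the worker just calls Start-Sleep as a stand-in for job-import, the computer names 'pc1' to 'pc4' are hypothetical, and the pool size of 4 is arbitrary.

```powershell
# Stand-in worker (assumption): sleeps instead of calling job-import.
$Worker = {
    param($Name, $Seconds)
    Start-Sleep -Seconds $Seconds
    "done: $Name"   # written to the pipeline so that EndInvoke() returns it
}

# Pool with up to 4 concurrent runspaces (arbitrary size for this sketch).
$RunspacePool = [runspacefactory]::CreateRunspacePool(1, 4)
$RunspacePool.Open()

$Jobs = New-Object System.Collections.ArrayList
$Stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

# Phase 1: start every worker. BeginInvoke() returns immediately.
foreach ($Name in 'pc1', 'pc2', 'pc3', 'pc4') {   # hypothetical names
    $PowerShell = [powershell]::Create()
    $PowerShell.RunspacePool = $RunspacePool
    $PowerShell.AddScript($Worker).AddArgument($Name).AddArgument(2) | Out-Null
    $Jobs.Add([pscustomobject]@{
        PowerShell = $PowerShell
        Handle     = $PowerShell.BeginInvoke()
    }) | Out-Null
}

# Phase 2: only now wait for the workers. EndInvoke() blocks per job.
$Results = foreach ($Job in $Jobs) {
    $Job.PowerShell.EndInvoke($Job.Handle)
    $Job.PowerShell.Dispose()
}
$Stopwatch.Stop()

$RunspacePool.Close()
$RunspacePool.Dispose()

$ElapsedSeconds = $Stopwatch.Elapsed.TotalSeconds
"Collected $($Results.Count) results in $ElapsedSeconds seconds"
```

With four workers that each sleep for 2 seconds, the elapsed time should land much closer to 2 seconds than to 8. Moving the EndInvoke() call back into the first loop should bring it up to roughly the sequential total. Note also that the sketch's worker writes its result to the pipeline; a worker that only assigns to a variable such as $Result gives EndInvoke() nothing to return.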

answered 2021-05-27T14:33:20.130