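Despite finding examples, I'm having some trouble with this. I think it may be an encoding problem, but I'm not sure. I'm trying to programmatically download a file from an HTTPS server that uses cookies (hence HttpWebRequest). I'm debug-printing the capacity of the streams to check them, but the output [raw] file looks different from the original. Trying other encodings has not helped.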

Code:

    Sub downloadzip(strURL As String, strDestDir As String)

    Dim request As HttpWebRequest
    Dim response As HttpWebResponse

    request = Net.HttpWebRequest.Create(strURL)
    request.UserAgent = strUserAgent
    request.Method = "GET"
    request.CookieContainer = cookieJar
    response = request.GetResponse()

    If response.ContentType = "application/zip" Then
        Debug.WriteLine("Is Zip")
    Else
        Debug.WriteLine("Is NOT Zip: is " + response.ContentType.ToString)
        Exit Sub
    End If

    Dim intLen As Int64 = response.ContentLength
    Debug.WriteLine("response length: " + intLen.ToString)

    Using srStreamRemote As StreamReader = New StreamReader(response.GetResponseStream(), Encoding.Default)
        'Using ms As New MemoryStream(intLen)
        Dim fullfile As String = srStreamRemote.ReadToEnd

        Dim memstream As MemoryStream = New MemoryStream(New UnicodeEncoding().GetBytes(fullfile))

        'test write out to file
        Dim data As Byte() = memstream.ToArray()
        Using filestrm As FileStream = New FileStream("c:\temp\debug.zip", FileMode.Create)
            filestrm.Write(data, 0, data.Length)
        End Using

        Debug.WriteLine("Memstream capacity " + memstream.Capacity.ToString)
        'Dim strData As String = srStreamRemote.ReadToEnd
        memstream.Seek(0, 0)
        Dim buffer As Byte() = New Byte(2048) {}
        Using zip As New ZipInputStream(memstream)
            Debug.WriteLine("zip stream cap " + zip.Length.ToString)
            zip.Seek(0, 0)
            Dim e As ZipEntry

            Dim flag As Boolean = True
            Do While flag ' daft, but won't assign e=zip... tries to evaluate
                e = zip.GetNextEntry
                If IsNothing(e) Then
                    flag = False
                    Exit Do
                Else
                    e.UseUnicodeAsNecessary = True
                End If

                If Not e.IsDirectory Then
                    Debug.WriteLine("Writing out " + e.FileName)
                    '    e.Extract(strDestDir)

                    Using output As FileStream = File.Open(Path.Combine(strDestDir, e.FileName), _
                                                          FileMode.Create, FileAccess.ReadWrite)
                        Dim n As Integer
                        Do While (n = zip.Read(buffer, 0, buffer.Length) > 0)
                            output.Write(buffer, 0, n)
                        Loop
                    End Using

                End If
            Loop
        End Using
        'End Using
    End Using 'srStreamRemote.Close()
    response.Close()
End Sub

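So I download a file of the correct size, but DotNetZip doesn't recognize it, and the files that get copied out are incomplete or invalid zips. I've spent most of today on this and am ready to give up.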

3 Answers

I think the answer is to break the problem down and change a few aspects of your code.

For example, let's get rid of converting the response stream to a string:

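' Read the response as raw bytes; binary data must never pass through a String.
Dim memStream As MemoryStream
Using rdr As System.IO.Stream = response.GetResponseStream()
    ' Assumes the server sent Content-Length (ContentLength is -1 when it did not).
    Dim count = Convert.ToInt32(response.ContentLength)
    ' VB array bounds are inclusive upper bounds, so (count - 1) gives exactly count bytes.
    Dim buffer = New Byte(count - 1) {}
    Dim bytesRead As Integer = 0
    Do While bytesRead < count
        Dim n As Integer = rdr.Read(buffer, bytesRead, count - bytesRead)
        If n = 0 Then Exit Do ' stream ended before the full body arrived
        bytesRead += n
    Loop
    memStream = New MemoryStream(buffer, 0, bytesRead)
End Using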

Next, there is a simpler way to write the contents of a memory stream to a file. Consider your code:

Dim data As Byte() = memstream.ToArray()
Using filestrm As FileStream = New FileStream("c:\temp\debug.zip", FileMode.Create)
    filestrm.Write(data, 0, data.Length)
End Using

It can be replaced with:

Using filestrm As FileStream = New FileStream("c:\temp\debug.zip", FileMode.Create)
    memstream.WriteTo(filestrm)
End Using

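This removes the need to copy the memory stream into another byte array and then push that array into a stream. The memory stream can write its data directly to the file (via the file stream), which saves the intermediate buffer.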

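I admit I haven't used the Zip/compression library you are using, but with the modifications above you have removed the unnecessary transfers between streams, byte arrays, and strings, which should eliminate the encoding problem you ran into.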
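To see why the string round trip breaks the download, here is a minimal sketch. The module name `EncodingRoundTripDemo`, the function `RoundTrip`, and the sample bytes are made up for illustration. It mirrors the question's `StreamReader(..., Encoding.Default)` followed by `UnicodeEncoding.GetBytes`, and assumes `Encoding.Default` is ASCII-compatible, which is the usual case.

```vbnet
Imports System
Imports System.Linq
Imports System.Text

Module EncodingRoundTripDemo
    ' Hypothetical helper: decode bytes as text with one encoding, then
    ' re-encode the text as UTF-16, as the question's code effectively does.
    Function RoundTrip(data As Byte()) As Byte()
        Dim text As String = Encoding.Default.GetString(data)
        Return Encoding.Unicode.GetBytes(text)
    End Function

    Sub Main()
        ' Illustrative bytes: a zip local-header signature plus arbitrary binary data.
        Dim original As Byte() = {&H50, &H4B, &H3, &H4, &HFF, &HFE, &H0, &H80}
        Dim mangled As Byte() = RoundTrip(original)

        Console.WriteLine("original length: " & original.Length)
        Console.WriteLine("mangled length:  " & mangled.Length)
        Console.WriteLine("identical:       " & original.SequenceEqual(mangled))
    End Sub
End Module
```

The very first byte pair already differs: `P` followed by `K` in the input becomes `P` followed by a zero byte in UTF-16, so the zip signature is gone before the library ever sees the data.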

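Give it a try and let us know how you get on. Try opening the file you saved ("C:\temp\debug.zip") to see whether it is reported as corrupt. If it is not, then you at least know that the download part of your code is working correctly.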
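If you would rather check the saved file from code than open it by hand, a rough first test is to look at its leading bytes. This is only a sketch: the module `ZipSignatureCheck` and the function `LooksLikeZip` are hypothetical names, and a matching signature shows only that the file starts like a zip, not that the whole archive is intact.

```vbnet
Imports System
Imports System.IO

Module ZipSignatureCheck
    ' Hypothetical helper: True when the data starts with "PK" followed by
    ' 03 04 (local file header) or 05 06 (end record of an empty archive).
    Function LooksLikeZip(data As Byte()) As Boolean
        If data Is Nothing OrElse data.Length < 4 Then Return False
        If data(0) <> &H50 OrElse data(1) <> &H4B Then Return False
        Return (data(2) = &H3 AndAlso data(3) = &H4) OrElse
               (data(2) = &H5 AndAlso data(3) = &H6)
    End Function

    Sub Main()
        ' Path taken from the question's debug output.
        Dim zipPath As String = "c:\temp\debug.zip"
        If File.Exists(zipPath) Then
            Dim header(3) As Byte ' four bytes: indices 0 to 3
            Dim read As Integer
            Using fs As New FileStream(zipPath, FileMode.Open, FileAccess.Read)
                read = fs.Read(header, 0, header.Length)
            End Using
            Console.WriteLine("Looks like a zip: " & (read = 4 AndAlso LooksLikeZip(header)))
        Else
            Console.WriteLine("File not found: " & zipPath)
        End If
    End Sub
End Module
```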

answered 2011-07-07T15:50:10.207

I thought I'd post the complete working solution to my own question. It combines the two excellent answers I received. Thank you both.

Sub downloadzip(strURL As String, strDestDir As String)
    Try

        Dim request As HttpWebRequest
        Dim response As HttpWebResponse

        request = Net.HttpWebRequest.Create(strURL)
        request.UserAgent = strUserAgent
        request.Method = "GET"
        request.CookieContainer = cookieJar
        response = request.GetResponse()

        If response.ContentType = "application/zip" Then
            Debug.WriteLine("Is Zip")
        Else
            Debug.WriteLine("Is NOT Zip: is " + response.ContentType.ToString)
            Exit Sub
        End If

        Dim intLen As Int32 = response.ContentLength
        Debug.WriteLine("response length: " + intLen.ToString)

        Dim memStream As MemoryStream
        Using stmResponse As IO.Stream = response.GetResponseStream()
            'Using ms As New MemoryStream(intLen)

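            ' VB array bounds are inclusive upper bounds: (intLen - 1) yields exactly intLen bytes.
            Dim buffer = New Byte(intLen - 1) {}

            Dim bytesRead As Integer = 0
            Do While bytesRead < intLen
                Dim n As Integer = stmResponse.Read(buffer, bytesRead, intLen - bytesRead)
                If n = 0 Then Exit Do ' connection closed before the full body arrived
                bytesRead += n
            Loop

            memStream = New MemoryStream(buffer, 0, bytesRead)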

            Dim res As Boolean = False
            res = ZipExtracttoFile(memStream, strDestDir)

        End Using 'srStreamRemote.Close()
        response.Close()



    Catch ex As Exception
        ' Report the failure instead of swallowing it silently.
        Debug.WriteLine("downloadzip failed: " + ex.Message)
    End Try
End Sub


Function ZipExtracttoFile(strm As MemoryStream, strDestDir As String) As Boolean

    Try
        Using zip As ZipFile = ZipFile.Read(strm)
            For Each e As ZipEntry In zip

                e.Extract(strDestDir)

            Next
        End Using
    Catch ex As Exception
        Return False
    End Try

    Return True

End Function
answered 2011-07-15T10:55:11.277

You can download into a MemoryStream and then inspect it:

Public Sub Download(url as String)
    Dim req As HttpWebRequest = System.Net.WebRequest.Create(url)
    req.Method = "GET"
    Dim resp As HttpWebResponse = req.GetResponse()
    If resp.ContentType = "application/zip" Then
        Console.Error.Write("The result is a zip file.")
        Dim length As Int64 = resp.ContentLength
        If length = -1 Then
            Console.Error.WriteLine("... length unspecified")
            length = 16 * 1024
        Else
            Console.Error.WriteLine("... has length {0}", length)
        End If
        Dim ms As New MemoryStream
        CopyStream(resp.GetResponseStream(), ms)  '' **see note below!!!!
        '' list contents of the zip file
        ms.Seek(0,SeekOrigin.Begin)
        Using zip As ZipFile = ZipFile.Read (ms)
            Dim e As ZipEntry
            Console.Error.WriteLine("Entries:")
            Console.Error.WriteLine("  {0,22}  {1,10}  {2,12}", _
                                    "Name", "compressed", "uncompressed")
            Console.Error.WriteLine("----------------------------------------------------")
            For Each e In zip
                Console.Error.WriteLine("  {0,22}  {1,10}  {2,12}", _
                                        e.FileName, _
                                        e.CompressedSize, _
                                        e.UncompressedSize)
            Next
        End Using
    Else
        Console.Error.WriteLine("The result is Not a zip file.")
        CopyStream(resp.GetResponseStream(), Console.OpenStandardOutput)
    End If
End Sub


Private Shared Sub CopyStream(input As Stream, output As Stream)
    Dim buffer(32768 - 1) As Byte
    Dim n As Int32
    Do
        n = input.Read(buffer, 0, buffer.Length)
        If n = 0 Then Exit Do
            output.Write(buffer, 0, n)
    Loop
End Sub

Edit

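Note: I don't recommend this code (this approach) if the zip file is very large. How large is "very large"? That depends, of course. The code above downloads the file into a MemoryStream, which means the entire contents of the zip file are held in memory. For a 28 KB zip file that is no problem. For a 2 GB zip file you could have a big problem.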

In that case you will want to stream it to a temporary file on disk rather than to a MemoryStream. I'll leave that as an exercise for the reader.
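One possible shape for that exercise is sketched below. It is not part of the original answer: the module `TempFileDownloadSketch` and the function `SaveToTempFile` are hypothetical names, and the copy loop simply repeats the `CopyStream` pattern with a `FileStream` as the destination so that only one 32 KB chunk is in memory at a time.

```vbnet
Imports System
Imports System.IO

Module TempFileDownloadSketch
    ' Hypothetical helper: copies the input stream to a new temporary file
    ' in fixed-size chunks and returns the path of that file.
    Function SaveToTempFile(input As Stream) As String
        Dim tempPath As String = Path.GetTempFileName()
        Using output As New FileStream(tempPath, FileMode.Create, FileAccess.Write)
            Dim buffer(32768 - 1) As Byte
            Do
                Dim n As Integer = input.Read(buffer, 0, buffer.Length)
                If n = 0 Then Exit Do ' end of stream
                output.Write(buffer, 0, n)
            Loop
        End Using
        Return tempPath
    End Function

    Sub Main()
        ' Stand-in for resp.GetResponseStream(): any readable stream works.
        Dim fakeBody As Byte() = {&H50, &H4B, &H5, &H6}
        Dim savedPath As String = SaveToTempFile(New MemoryStream(fakeBody))
        Console.WriteLine("Saved " & New FileInfo(savedPath).Length & " bytes to " & savedPath)
        File.Delete(savedPath) ' the caller is responsible for cleanup
    End Sub
End Module
```

In the download code you would pass `resp.GetResponseStream()` instead of the in-memory stand-in, open the result with the file-name overload of `ZipFile.Read` (assuming your DotNetZip version provides it), and delete the temporary file once extraction is done.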

The above works for "reasonably sized" zip files, where "reasonable" depends on your machine configuration and application scenario.

answered 2011-07-13T17:38:41.977