Exchange 2013: When the DAG NIC Affects Transport, That Can’t Be Good

NOTE: This post – drafted, composed, written, and published by me – originally appeared on https://blogs.technet.microsoft.com/johnbai and is potentially (c) Microsoft.

In my on-premises Exchange 2013 lab, a delivery queue to a mailbox database had built up:

 

Get-TransportService | Get-Queue | Where{$_.MessageCount -gt 0} | FT -AutoSize

 

Identity      DeliveryType          Status MessageCount Velocity RiskLevel OutboundIPPool NextHopDomain

--------      ------------          ------ ------------ -------- --------- -------------- -------------

Stromkarlen\3 SmtpDeliveryToMailbox Retry  15           0        Normal    0              emea-sweex15-01 store 001

 

What’s interesting is that the queue is specifically for delivery to a mailbox, yet the associated error is a DNS error:

 

Get-Queue "Stromkarlen\3" | Select -ExpandProperty LastError -Unique

451 4.4.0 DNS query failed. The error was: SMTPSEND.DNS.NonExistentDomain; nonexistent domain

 

Looking at the recipient, we’re targeting the correct store:

 

Get-Message -Queue "Stromkarlen\3" -IncludeRecipientInfo | Select -ExpandProperty Recipients -Unique

 

Address          : [email protected]

OutboundIPPool   : 0

Type             : Mailbox

FinalDestination : CN=EMEA-SWEEX15-01 Store 001,CN=Databases,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=minvan,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=minvan,DC=se

Status           : Ready

LastError        : [{LRT=};{LED=400 4.4.7 Message delayed};{FQDN=};{IP=}]

 

The store was mounted, and the last log had been inspected ~14 minutes before I started investigating:

 

Get-MailboxDatabaseCopyStatus "EMEA-SWEEX15-01 Store 001" | FT -AutoSize

 

Name                                  Status  CopyQueueLength ReplayQueueLength LastInspectedLogTime ContentIndexState

----                                  ------  --------------- ----------------- -------------------- -----------------

EMEA-SWEEX15-01 Store 001\HYLDA       Healthy 0               0                 8/21/2014 7:25:24 PM Healthy

EMEA-SWEEX15-01 Store 001\STROMKARLEN Mounted 0               0                                      Healthy

 

The address we were sending to is correct:

 

Get-Mailbox Administrator | FL *addr*

 

 

AddressBookPolicy             :

ForwardingAddress             :

ForwardingSmtpAddress         :

OfflineAddressBook            :

JournalArchiveAddress         :

GeneratedOfflineAddressBooks  : {}

AddressListMembership         : {\Mailboxes(VLV), \All Mailboxes(VLV), \All Recipients(VLV), \Default Global Address List, \All Users}

EmailAddresses                : {smtp:[email protected], SMTP:[email protected]}

HiddenFromAddressListsEnabled : False

EmailAddressPolicyEnabled     : True

PrimarySmtpAddress            : [email protected]

WindowsEmailAddress           : [email protected]

 

In the connectivity log (Mailbox), I see the following:

 

2014-08-21T19:40:58.639Z,08D187F0D08C4E37,MapiDelivery,EMEA-SWEEX15-01 Store 002,>,Starting delivery

2014-08-21T19:40:58.639Z,08D187F0D08C4E37,MapiDelivery,EMEA-SWEEX15-01 Store 002,>,Connecting to server Stromkarlen.minvan.se session type Mailbox

2014-08-21T19:40:58.685Z,08D187F0D08C4E37,MapiDelivery,EMEA-SWEEX15-01 Store 002,-,Messages: 1 Bytes: 190 Recipients: 1

2014-08-21T19:40:58.701Z,08D187F0D08C4E39,MapiDelivery,EMEA-SWEEX15-01 Store 001,+,Delivery;MailboxServer=Stromkarlen.minvan.se;Database=EMEA-SWEEX15-01 Store 001

2014-08-21T19:40:58.701Z,08D187F0D08C4E39,MapiDelivery,EMEA-SWEEX15-01 Store 001,>,Starting delivery

2014-08-21T19:40:58.717Z,08D187F0D08C4E39,MapiDelivery,EMEA-SWEEX15-01 Store 001,>,Connecting to server Stromkarlen.minvan.se session type Mailbox

2014-08-21T19:40:58.763Z,08D187F0D08C4E39,MapiDelivery,EMEA-SWEEX15-01 Store 001,-,Messages: 1 Bytes: 190 Recipients: 1

2014-08-21T19:42:59.453Z,08D187F0D08C4E3B,MapiDelivery,EMEA-SWEEX15-01 Store 002,+,Delivery;MailboxServer=Stromkarlen.minvan.se;Database=EMEA-SWEEX15-01 Store 002

2014-08-21T19:42:59.453Z,08D187F0D08C4E3B,MapiDelivery,EMEA-SWEEX15-01 Store 002,>,Starting delivery

2014-08-21T19:42:59.469Z,08D187F0D08C4E3B,MapiDelivery,EMEA-SWEEX15-01 Store 002,>,Connecting to server Stromkarlen.minvan.se session type Mailbox

2014-08-21T19:42:59.500Z,08D187F0D08C4E3B,MapiDelivery,EMEA-SWEEX15-01 Store 002,-,Messages: 1 Bytes: 190 Recipients: 1

2014-08-21T19:42:59.531Z,08D187F0D08C4E3D,MapiDelivery,EMEA-SWEEX15-01 Store 001,+,Delivery;MailboxServer=Stromkarlen.minvan.se;Database=EMEA-SWEEX15-01 Store 001

2014-08-21T19:42:59.531Z,08D187F0D08C4E3D,MapiDelivery,EMEA-SWEEX15-01 Store 001,>,Starting delivery

2014-08-21T19:42:59.531Z,08D187F0D08C4E3D,MapiDelivery,EMEA-SWEEX15-01 Store 001,>,Connecting to server Stromkarlen.minvan.se session type Mailbox

2014-08-21T19:42:59.563Z,08D187F0D08C4E3D,MapiDelivery,EMEA-SWEEX15-01 Store 001,-,Messages: 1 Bytes: 190 Recipients: 1
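
As an aside, connectivity log entries like these are comma-separated, so they are easy to slice up outside of Exchange. The following is a minimal Python sketch, not anything that ships with Exchange. The field names (`timestamp`, `session`, `source`, `destination`, `direction`, `description`) are assumed labels for the six columns visible in the output above, and the sample rows are copied from it.

```python
import csv
import io
from collections import Counter

# Assumed labels for the six comma-separated columns seen in the log output.
FIELDS = ["timestamp", "session", "source", "destination", "direction", "description"]

SAMPLE = """\
2014-08-21T19:40:58.701Z,08D187F0D08C4E39,MapiDelivery,EMEA-SWEEX15-01 Store 001,+,Delivery;MailboxServer=Stromkarlen.minvan.se;Database=EMEA-SWEEX15-01 Store 001
2014-08-21T19:40:58.701Z,08D187F0D08C4E39,MapiDelivery,EMEA-SWEEX15-01 Store 001,>,Starting delivery
2014-08-21T19:40:58.717Z,08D187F0D08C4E39,MapiDelivery,EMEA-SWEEX15-01 Store 001,>,Connecting to server Stromkarlen.minvan.se session type Mailbox
2014-08-21T19:40:58.763Z,08D187F0D08C4E39,MapiDelivery,EMEA-SWEEX15-01 Store 001,-,Messages: 1 Bytes: 190 Recipients: 1
"""


def parse_connectivity_log(text):
    """Turn raw connectivity log text into a list of dicts keyed by FIELDS."""
    return list(csv.DictReader(io.StringIO(text), fieldnames=FIELDS))


def sessions_closed(rows):
    """Return the session IDs that logged a disconnect ('-') event."""
    return {row["session"] for row in rows if row["direction"] == "-"}


rows = parse_connectivity_log(SAMPLE)
print(Counter(row["direction"] for row in rows))
print(sessions_closed(rows))
```

Grouping by session ID this way makes it easier to see which sessions opened, logged progress, and closed.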

 

But Submission gave me the juicy details of what our issue was stemming from:

 

2014-08-21T19:43:54.533Z,08D187F052C39BA0,MapiSubmission,bbbfef75-3381-42dc-b027-850f29eebc62,+,STROMKARLEN.minvan.se
2014-08-21T19:43:54.533Z,08D187F052C39BA1,SMTP,mailboxtransportsubmissioninternalproxy,+,Undefined 00000000-0000-0000-0000-000000000000;QueueLength=<no priority counts>
2014-08-21T19:43:54.533Z,08D187F052C39BA1,SMTP,mailboxtransportsubmissioninternalproxy,>,Non-existent domain reported by y.y.y.y. [Domain:Result] = Hylda.minvan.se:InfoDomainNonexistent; STROMKARLEN.minvan.se:InfoDomainNonexistent;
2014-08-21T19:43:54.533Z,08D187F052C39BA1,SMTP,mailboxtransportsubmissioninternalproxy,-,Messages: 0 Bytes: 0 (The DNS query for  'Undefined':'mailboxtransportsubmissioninternalproxy':'00000000-0000-0000-0000-000000000000' failed with error : InfoDomainNonexistent)
2014-08-21T19:43:54.533Z,08D187F052C39BA0,MapiSubmission,bbbfef75-3381-42dc-b027-850f29eebc62,>,"Failed; HResult: 2684354560; DiagnosticInfo: Stage:CommitMailItem, SmtpResponse:451 4.4.0 DNS query failed. The error was: SMTPSEND.DNS.NonExistentDomain; nonexistent domain"
2014-08-21T19:43:54.533Z,08D187F052C39BA0,MapiSubmission,bbbfef75-3381-42dc-b027-850f29eebc62,-,RegularSubmissions: 0 ShadowSubmissions: 0 Bytes: 0 Recipients: 0 Failures: 1 ReachedLimit: False Idle: False
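
The line worth extracting from these entries is the "Non-existent domain reported by" one, since it names both the DNS server that answered and the hosts it could not resolve. Below is a rough Python sketch for pulling those two pieces out of the description field. The regular expressions are assumptions based only on the single log line shown above, so other Exchange builds may word it differently.

```python
import re

# Both patterns are inferred from the one log line in this post (assumption).
SERVER_RE = re.compile(r"Non-existent domain reported by (\S+?)\.\s")
RESULT_RE = re.compile(r"([A-Za-z0-9.-]+):(Info[A-Za-z]+);")


def parse_dns_failure(description):
    """Return the reporting DNS server and per-host results, or None."""
    server = SERVER_RE.search(description)
    if server is None:
        return None
    # Everything after the "[Domain:Result] =" marker is host:result pairs.
    _, _, tail = description.partition("[Domain:Result] =")
    return {
        "dns_server": server.group(1),
        "results": dict(RESULT_RE.findall(tail)),
    }


line = (
    "Non-existent domain reported by y.y.y.y. [Domain:Result] = "
    "Hylda.minvan.se:InfoDomainNonexistent; "
    "STROMKARLEN.minvan.se:InfoDomainNonexistent;"
)
info = parse_dns_failure(line)
print(info["dns_server"])
print(sorted(info["results"]))
```

Comparing the extracted server address against the DNS servers you expect Exchange to use is what points at the misconfigured NIC.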

 

Then, I saw my answer. We’re hitting the wrong DNS server for the query:

 

nslookup hylda.minvan.se

Server:  UnKnown

Address:  x.x.x.x

 

Name:    hylda.minvan.se

Address:  x.x.x.x
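
To make the "wrong DNS server" check repeatable, one option is to compare the DNS servers configured on each NIC against a reference NIC. This is only an illustrative Python sketch: the NIC names and addresses are hypothetical placeholders, and collecting the real per-NIC values (for example from `ipconfig /all`) is left out.

```python
def find_mismatched_nics(nic_dns, reference_nic):
    """Return NICs whose DNS server list differs from the reference NIC's."""
    expected = nic_dns[reference_nic]
    return sorted(
        name
        for name, servers in nic_dns.items()
        if name != reference_nic and servers != expected
    )


# Hypothetical inventory: names and addresses are placeholders, not lab values.
nic_dns = {
    "MAPI": ["10.0.0.10", "10.0.0.11"],
    "DAG": ["192.0.2.53"],
}
print(find_mismatched_nics(nic_dns, "MAPI"))  # ['DAG']
```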

 

I changed the DNS settings on the DAG NIC and resubmitted the queue:

 

Retry-Queue "Stromkarlen\3" -Resubmit:$TRUE

 

Get-TransportService | Get-Queue | Where{$_.MessageCount -gt 0} | FT -AutoSize

 

Identity      DeliveryType          Status MessageCount Velocity RiskLevel OutboundIPPool NextHopDomain

--------      ------------          ------ ------------ -------- --------- -------------- -------------

Stromkarlen\3 SmtpDeliveryToMailbox Active 15           0.02     Normal    0              emea-sweex15-01 store 001

 

 

Get-TransportService | Get-Queue | Where{$_.MessageCount -gt 0} | FT -AutoSize

 

Essentially, this boiled down to two NICs using different DNS servers. But why didn’t this affect the SMTP probes to the Health Mailboxes? If it didn’t, why did we use local DNS for Health Manager and foreign DNS for delivery? And, if that’s the case, is this a bug?

 

Then, the answer finally arrived in the form of a realization: if we had no queues to the Health Mailboxes, maybe the probe was broken? Alright, time to investigate that piece.

 

Sure enough, the probes had failed some time back and were marked as poisoned:

 

   MachineName: Stromkarlen.minvan.se

 

StartTime            ProbeName                      Result       Duration   Latency    TargetResource/Error                                         HealthSet

---------            ---------                      ------       --------   -------    --------------------                                         ---------

2014-08-21 20:12:57  Mapi.Submit.Probe              Poisoned     02:28.388  00:00.000  A previous poisoned execution was detected. This is indic… MailboxTransport

2014-08-21 20:12:59  MailboxDeliveryAvailability    Poisoned     03:38.978  00:00.000  A previous poisoned execution was detected. This is indic… MailboxTransport

2014-08-21 20:13:04  MailboxTransportSubmissionS… Poisoned     02:49.648  00:00.000  {msexchangesubmission, A previous poisoned execution was … MailboxTransport

2014-08-21 20:13:20  MailboxTransportDeliverySer… Poisoned     02:30.491  00:00.000  {msexchangedelivery, A previous poisoned execution was de… MailboxTransport
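
If you save this kind of probe output to a text file, flagging poisoned probes can be automated. The Python sketch below assumes the columns are separated by two or more spaces, as in the output above. The first two sample rows are shortened copies of rows from that output, and the third row is a hypothetical healthy probe added for contrast.

```python
import re

SAMPLE = """\
2014-08-21 20:12:57  Mapi.Submit.Probe              Poisoned     02:28.388  00:00.000  A previous poisoned execution was detected.
2014-08-21 20:12:59  MailboxDeliveryAvailability    Poisoned     03:38.978  00:00.000  A previous poisoned execution was detected.
2014-08-21 20:13:30  ExampleHealthyProbe            Succeeded    00:01.000  00:00.100  (hypothetical row)
"""


def poisoned_probes(text):
    """Return probe names whose Result column reads 'Poisoned'."""
    names = []
    for line in text.splitlines():
        # Columns: StartTime, ProbeName, Result, ... (split on 2+ spaces).
        cols = re.split(r"\s{2,}", line.strip())
        if len(cols) >= 3 and cols[2] == "Poisoned":
            names.append(cols[1])
    return names


print(poisoned_probes(SAMPLE))
```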

 

Moral of the story: logs are just as important as Health Manager/Active Probing, since the probes themselves can fail or be marked as poisoned.

O365: CDN Change Causes OWA Client Error


Recently, we’ve seen a pattern of escalations wherein users are no longer able to access OWA. Specifically, the error will be similar to the following:

In the details, we’ll see the error we’re concerned with:

X-OWA-Error: ClientError;exMsg='_g' is undefined; file=https://pod51048.outlook.com/owa/:362

If we use Network Tracing (F12 in Internet Explorer) [or Fiddler] we’ll see a failure to connect to the CDN:

The connection to 'r1.res.office365.com' failed. Error: ConnectionRefused (0x274d). System.Net.Sockets.SocketException No connection could be made because the target machine actively refused it 23.221.8.9:443

Using ARIN, we can check to make sure that the CDN (Content Delivery Network) we’re trying to hit is owned by Akamai.

The error we’re receiving occurs because the client cannot load ‘boot.worldwide.0.mouse.js’ from the CDN, as evidenced by the refused connection to the RES server.

The root cause of this issue is the configuration of the customer-owned firewall. This is evidenced by the client’s inability to connect to one specific CDN host while other users are able to use OWA without issue; notably, because they are connecting to a different CDN host (‘r4.res.office365.com’, for example).
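
A quick way to confirm this pattern from an affected client is to attempt a TCP connection to each CDN host on port 443 and compare the results. The following Python sketch is one way to do that. The `connect` parameter is an illustrative hook so the function can be exercised without network access, and the host names in the usage comment are the two mentioned in this post.

```python
import socket


def check_hosts(hosts, port=443, timeout=5.0, connect=socket.create_connection):
    """Try a TCP connection to each host and report 'ok' or the failure."""
    results = {}
    for host in hosts:
        try:
            conn = connect((host, port), timeout)
        except OSError as exc:
            # Covers refused connections, timeouts, and DNS lookup failures.
            results[host] = f"failed: {exc}"
        else:
            conn.close()
            results[host] = "ok"
    return results


# Example usage (performs real network connections):
# print(check_hosts(["r1.res.office365.com", "r4.res.office365.com"]))
```

If one host fails from behind the firewall while another succeeds, that supports the firewall explanation rather than a service-side outage.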

Since the CDN is controlled by a third party and we have no way to predict which hosts will be up, in use, or referred to, our recommendation is to use URL-based filtering. You can read the EHLO blog post (written by our very own Tim Heeney) on the recommendation for URL-based filtering, here: http://blogs.technet.com/b/exchange/archive/2013/12/02/office-365-url-based-filtering-is-just-better-and-easier-to-sustain.aspx
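
To illustrate why URL-based (host name) filtering is easier to sustain than IP-based rules, here is a small Python sketch of a wildcard allow-list check. The patterns are hypothetical examples built from host names seen in this post; a real allow-list should come from Microsoft's published endpoint list, not from this snippet.

```python
from fnmatch import fnmatchcase

# Hypothetical allow-list for illustration only.
ALLOWED_PATTERNS = ["*.res.office365.com", "*.outlook.com"]


def is_allowed(hostname, patterns=ALLOWED_PATTERNS):
    """Return True if the host name matches any wildcard pattern."""
    host = hostname.lower().rstrip(".")
    return any(fnmatchcase(host, pattern) for pattern in patterns)


print(is_allowed("r1.res.office365.com"))  # True
print(is_allowed("r4.res.office365.com"))  # True
print(is_allowed("cdn.example.org"))       # False
```

A rule keyed on the name keeps working when the CDN hands out a different address, whereas a rule keyed on an address such as 23.221.8.9 does not.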

Exchange 2013: Recipient Filtering Agent (More Lessons in Proxy)


With the arrival of Exchange 2013 came some design changes and, with those changes, inevitably something was bound to catch us off-guard. To help explain this better, enter the proxy from the CAS Frontend to the Mailbox backend.

All SMTP connections (we’re focusing on SMTP for this post) utilize the proxy of the Frontend Transport to make an SMTP connection. Even on multi-role servers, the incoming request on port 25 utilizes the proxy to port 2525 for Transport.

Here is a screenshot over-view of Transport from a previous EHLO blog:

The introduction of protocol proxy isn’t limited to SMTP but we’re focusing on SMTP, as it’s relevant to this post.

 

With the introduction of protocol proxy, there was an introduction of an issue with Transport Agents. Namely, Transport Agents that were installed on a Frontend server (CAS) were actually installed on a backend server. To get around this issue, one had to use local PowerShell and add the Exchange PowerShell Snap-in. However, what wasn’t considered is the following:

 

An Exchange Administrator installs the Transport Agents via PowerShell on the backend server and enables the Recipient Filtering Agent. [This introduces our problem.]

 

User A, external to the organization, composes a message to User B and an invalid recipient, User C, who are internal to the organization. The source of the invalid recipient isn’t as important as the fact that the invalid recipient exists in the header. When the external mail server connects to the on-premises Frontend, the typical SMTP traffic occurs:

 

  1. EHLO <domain>
  2. Server responds with “Hello <domain>”
  3. MAIL FROM: [email protected]
  4. Server responds with “Sender O.K.”
  5. RCPT TO: [email protected]
  6. RCPT TO: [email protected]
  7. DATA command is sent

 

At this point in the session, the connection is proxied to the Mailbox server that hosts UserB. The server, seeing that UserB exists, responds with “Recipient OK” but, due to tarpitting, this response doesn’t quite reach the Frontend yet. When the Mailbox server receives UserC, it responds with 5.1.1, but this happens during the data stream. This causes an NDR to be generated for all recipients of the message, which also means that the valid recipient never receives the email.

 

This is an issue with Exchange 2013 and the fully supported scenario is to only activate the recipient filtering agent on the Frontend server. The Frontend will respond correctly with the “250 2.1.5 Recipient OK” and “5.1.1 Invalid Recipient”, while the correct recipient receives the email and an NDR is generated for the invalid recipient.
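
The difference between the two placements can be summarized with a toy model. The Python sketch below is not real SMTP and does not talk to Exchange. It only encodes the outcomes described above, and the function name and example addresses are hypothetical.

```python
def simulate_delivery(recipients, valid, filter_on):
    """Toy model of who gets the message and who triggers an NDR."""
    if filter_on not in ("frontend", "backend"):
        raise ValueError("filter_on must be 'frontend' or 'backend'")
    invalid = [r for r in recipients if r not in valid]
    if filter_on == "backend" and invalid:
        # The 5.1.1 arrives during the data stream, so the whole message
        # fails and an NDR is generated for every recipient.
        return {"delivered": [], "ndr": list(recipients)}
    # With filtering on the Frontend, each RCPT TO is answered individually.
    return {
        "delivered": [r for r in recipients if r in valid],
        "ndr": invalid,
    }


recipients = ["userb@example.com", "userc@example.com"]
valid = {"userb@example.com"}
print(simulate_delivery(recipients, valid, "backend"))
print(simulate_delivery(recipients, valid, "frontend"))
```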

O365: Accessing Another Mailbox via OWA URL


It may become necessary for an admin or delegate to access a mailbox (other than their own) in OWA. There are two ways to do this and most people are familiar with the method of changing the URL, which is what I’ll be covering in this post.

In Wave14 (Exchange 2010), you merely had to append the user’s SMTP address to the end of the OWA URL. So, for example, to access Amy Luu’s mailbox in my test tenant, I would add the following: [email protected]

In Wave15 (Exchange 2013), we merely need to add another character at the end of the URL for this to work: [email protected]/
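
As a small illustration of the difference between the two versions, the Python sketch below builds both URL forms. The base URL and SMTP address are hypothetical placeholders, and the function name is made up for this example. It assumes the extra character in Wave15 is a trailing slash, as shown above.

```python
from urllib.parse import quote


def owa_mailbox_url(base_url, smtp_address, wave):
    """Build the OWA URL for opening another mailbox (Wave14 or Wave15)."""
    base = base_url.rstrip("/")
    path = quote(smtp_address, safe="@")
    if wave == 14:
        return f"{base}/{path}"
    if wave == 15:
        # Wave15 needs a trailing slash after the SMTP address.
        return f"{base}/{path}/"
    raise ValueError("wave must be 14 or 15")


# Placeholder values, not a real tenant.
print(owa_mailbox_url("https://mail.example.com/owa", "amy@example.com", 14))
print(owa_mailbox_url("https://mail.example.com/owa", "amy@example.com", 15))
```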